Description

The present invention relates to the field of molecular biology. In particular, it relates to, among other things,
nucleotide sequences of Staphylococcus aureus, contigs, ORFs, fragments, probes, primers and related polynucleotides
thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof,
such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.

The genus Staphylococcus includes at least 20 distinct species. (For a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, Chapter 1, pgs. 1-37 in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI,
R. Novick, Ed., VCH Publishers, New York (1990).) Species differ from one another by 80% or more, by hybridization
kinetics, whereas strains within a species are at least 90% identical by the same measure.

The species Staphylococcus aureus, a gram-positive, facultatively aerobic, clump-forming coccus, is among the
most important etiological agents of bacterial infection in humans, as discussed briefly below.

Human Health and S. Aureus 


Staphylococcus aureus is a ubiquitous pathogen. (See, for instance, Mims et al., MEDICAL MICROBIOLOGY,
Mosby-Year Book Europe Limited, London, UK (1993).) It is an etiological agent of a variety of conditions, ranging in
severity from mild to fatal. A few of the more common conditions caused by S. aureus infection are burns, cellulitis,
eyelid infections, food poisoning, joint infections, neonatal conjunctivitis, osteomyelitis, skin infections, surgical wound
infection, scalded skin syndrome and toxic shock syndrome, some of which are described further below.

Burn wounds generally are sterile initially. However, they generally compromise physical and immune barriers to
infection, cause loss of fluid and electrolytes and result in local or general physiological dysfunction. After cooling,
contact with viable bacteria results in mixed colonization at the injury site. Infection may be restricted to the non-viable
debris on the burn surface ("eschar"), it may progress into full skin infection and invade viable tissue below the eschar
and it may reach below the skin, enter the lymphatic and blood circulation and develop into septicaemia. S. aureus is
among the most important pathogens typically found in burn wound infections. It can destroy granulation tissue and
produce severe septicaemia.

Cellulitis 

Cellulitis, an acute infection of the skin that expands from a typically superficial origin to spread below the cutaneous
layer, most commonly is caused by S. aureus in conjunction with S. pyogenes. Cellulitis can lead to systemic infection.
In fact, cellulitis can be one aspect of synergistic bacterial gangrene. This condition typically is caused by a mixture of
S. aureus and microaerophilic streptococci. It causes necrosis and treatment is limited to excision of the necrotic tissue.
The condition often is fatal.

Eyelid infections

S. aureus is the cause of styes and of "sticky eye" in neonates, among other eye infections. Typically such infections
are limited to the surface of the eye, and may occasionally penetrate the surface with more severe consequences.

Food poisoning

Some strains of S. aureus produce one or more of five serologically distinct, heat and acid stable enterotoxins that
are not destroyed by the digestive processes of the stomach and small intestine (enterotoxins A-E). Ingestion of the toxin,
in sufficient quantities, typically results in severe vomiting, but not diarrhoea. The effect does not require viable bacteria.
Although the toxins are known, their mechanism of action is not understood.

Joint infections 

S. aureus infects bone joints, causing diseases such as osteomyelitis.


Osteomyelitis 

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long growing
bones.


Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections.
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of surgical wounds can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing
systemic diseases, such as endocarditis and osteomyelitis.


Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.


Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin.
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed as a
disease solely of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS") S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains 

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of
S. aureus encountered in hospital infections today do not respond to penicillin; although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicillinoic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated
episomally, typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
multidrug resistance.

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus,
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline,
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply
drug resistant.

The molecular genetics of most types of drug resistance in S. aureus has been elucidated. (See Lyon et al.,
Microbiology Reviews 52: 88-134 (1987).) Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration
of a resistance element into the S. aureus genome itself.

Thus far each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed.

Molecular Genetics of Staphylococcus Aureus 


Despite its importance in, among other things, human disease, relatively little is known about the genome of this
organism.

Most genetic studies of S. aureus have been carried out using the strain NCTC8325, which contains prophages
psi11, psi12 and psi13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

Those studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular,
covalently closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like.

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low
resolution and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325. (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R. P. Novick, Ed., VCH Publishers, New York (1990).) The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001 which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl. C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add
dramatically to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.

There is a need therefore to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism.

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary
nucleotide sequences which were generated are provided in SEQ ID NOS:1-5,191.

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one
embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS:1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to, magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the
Staphylococcus aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genome of the present invention. The recombinant constructs of the present invention comprise
vectors, such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and
proteins, encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an
immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers,
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates,
pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
has enhanced, and will greatly enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced
contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function,
including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of
regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative
genomics and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer
Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature
data extracted from the sequence files by Factura to the Unix based Staphylococcus aureus relational database.
Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences
from Genbank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et
al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
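
Two of the steps named in the FIGURE 2 data flow, the screening of reads for ambiguous nucleotides and the identification of open reading frames, can be illustrated with a short sketch. The sketch is not the seq_filter or zorf program, whose internals are not described here. The function names, the whole-read ambiguity fraction, the use of ATG as the only start codon and TAA, TAG and TGA as stop codons, and the minimum ORF length are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: IUPAC nucleotide codes other than A, C, G and T count as ambiguous.
AMBIGUOUS = set("NRYKMSWBDHV")
# Assumption: standard stop codons and a single ATG start codon.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def ambiguous_fraction(seq: str) -> float:
    """Fraction of positions that carry an IUPAC ambiguity code."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in AMBIGUOUS for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement; characters other than A, C, G, T are left unchanged."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[str, int, int]]:
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, start, end) tuples. Coordinates are 0-based and
    end-exclusive on the strand that was scanned ("+" is the input,
    "-" is its reverse complement). The stop codon is included.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


if __name__ == "__main__":
    read = "CCATGAAATAGCC"  # made-up example, not a sequence of the invention
    if ambiguous_fraction(read) <= 0.02:  # 2% threshold taken from the text
        print(find_orfs(read))
```

In this sketch a read is simply accepted or rejected on its overall ambiguity fraction, whereas the text describes trimming portions of a read; a windowed trim would be a natural refinement under the same threshold.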

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID
NOS:1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC
nomenclature system.)

In addition to the aforementioned Staphylococcus aureus polynucleotide and polynucleotide sequences, the
present invention provides the nucleotide sequences of SEQ ID NOS:1-5,191, or representative fragments thereof, in
a form which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-5,191" refers
to any portion of SEQ ID NOS:1-5,191 which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of
Staphylococcus aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided in
Tables 1-3.

As discussed in detail below, the information provided in SEQ ID NOS:1-5,191 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus
aureus proteins.

While the presently disclosed sequences of SEQ ID NOS:1-5,191 are highly accurate, sequencing techniques are
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-5,191. However, once the
present invention is made available (i.e., once the information in SEQ ID NOS:1-5,191 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS:1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystem's (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques, potential
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing
effort, also of a routine nature, to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS:1-5,191 were corrected, the resulting nucleotide
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine
application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection ("ATCC").

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat.
However, the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known fasta algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
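
By way of illustration only, the percent identity thresholds discussed above can be checked with a few lines of code once a pairwise alignment is in hand. The sketch below does not implement the fasta or BLASTN algorithms; it assumes that an alignment has already been produced by such a program and supplied as two gapped strings of equal length. The function names, the use of "-" as the gap character, and the convention of dividing matches by the number of alignment columns are assumptions, and other identity conventions exist.

```python
from typing import Optional

# Thresholds named in the text, highest first.
THRESHOLDS = (99.9, 99.0, 95.0)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Assumption: inputs are rows of a pairwise alignment, equal in length,
    with "-" marking gaps. Gap columns count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        a == b and a != "-"
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


def identity_tier(pct: float) -> Optional[float]:
    """Highest threshold (99.9, 99 or 95) met by pct, or None if below 95."""
    for threshold in THRESHOLDS:
        if pct >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Made-up 1000-column alignment with two mismatches.
    reference = "ACGT" * 250
    candidate = "TT" + reference[2:]
    pct = percent_identity(reference, candidate)
    print(pct, identity_tier(pct))
```

Under this convention a 1000-nucleotide sequence may differ from its reference at up to 50 positions and still meet the 95% threshold, at up to 10 positions for 99%, and at only 1 position for 99.9%.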

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide
sequence of SEQ ID NOS:1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention, i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus
genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
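
As one hedged illustration of the ASCII file option mentioned above, the sketch below records named sequences in a plain text file and reads them back. The FASTA-style layout (a ">" header line followed by wrapped sequence lines) is an assumption chosen because it is widely used; the text does not prescribe it. The function names, the record names and the 60-character line width are likewise hypothetical.

```python
import os
import tempfile
from typing import Dict


def write_fasta(records: Dict[str, str], path: str, width: int = 60) -> None:
    """Record sequences as an ASCII text file in a FASTA-style layout."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            # Wrap the sequence into fixed-width lines for readability.
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path: str) -> Dict[str, str]:
    """Read the file back into a name -> sequence mapping."""
    records: Dict[str, str] = {}
    name = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line
    return records


if __name__ == "__main__":
    # Placeholder records; these are not sequences of the invention.
    example = {"hypothetical_record_1": "ACGT" * 40, "hypothetical_record_2": "GGCC"}
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "sequences.txt")
        write_fasta(example, path)
        assert read_fasta(path) == example
```

The same mapping could equally be loaded into a relational database table; a flat text file is shown only because it needs no additional software to read.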

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-5,191, the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially
important proteins, such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information described herein. Such systems are designed to identify, among other things, commercially important
fragments of the Staphylococcus aureus genome.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage
means used to analyse the nucleotide sequence information of the present invention. The minimum hardware means
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means,
output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means
for supporting and implementing a search means.

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data
storage means. Search means are used to identify fragments or regions of the present genomic sequences which
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means are available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBIA). A skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present computer-based systems.

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in gene expression and
protein processing, may be of shorter length.
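
The relationship between target length and random occurrence can be illustrated with a rough calculation. The sketch below assumes, for simplicity, that residues are independent and equally frequent, which real genomes only approximate. The function name and the database length of 2,800,000 residues used in the usage note are illustrative assumptions, not values taken from this specification.

```python
def expected_random_hits(target_length, alphabet_size, database_length):
    """Expected number of chance exact matches to one target sequence.

    Assumes independent, equiprobable residues (a simplifying assumption).
    alphabet_size is 4 for nucleotides and 20 for amino acids.
    """
    if target_length > database_length:
        return 0.0
    positions = database_length - target_length + 1
    return positions * (1.0 / alphabet_size) ** target_length
```

Under these assumptions, a six-nucleotide target is expected to occur by chance several hundred times in a database of 2,800,000 nucleotides, whereas a 30-nucleotide target is expected to occur by chance far less than once.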

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or
combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include,
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in the identified fragment.
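
A ranked output of this kind can be sketched, purely for illustration, with an ungapped sliding-window comparison. This is not the BLAST or BLAZE algorithm; it is a deliberately simple stand-in, and the function name and tuple layout are assumptions made for the sketch.

```python
def rank_fragments(genome, target, top=5):
    """Rank genome windows by ungapped percent identity to the target.

    Returns up to `top` tuples of (1-based start, fragment, percent identity),
    best match first; ties are broken by position.
    """
    genome, target = genome.upper(), target.upper()
    n = len(target)
    hits = []
    for start in range(len(genome) - n + 1):
        fragment = genome[start:start + n]
        matches = sum(a == b for a, b in zip(fragment, target))
        hits.append((start + 1, fragment, 100.0 * matches / n))
    # Highest identity first, then leftmost position.
    hits.sort(key=lambda hit: (-hit[2], hit[0]))
    return hits[:top]
```

Each returned entry identifies both the fragment and its degree of identity to the target, which is the information the preferred output format is described as presenting.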

A variety of comparing means can be used to compare a target sequence or target motif with the data storage
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, implementing
software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can
readily recognize that any one of the publicly available homology search programs can be used as the search means
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known
to those of skill also may be employed in this regard.

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once
it is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108,
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system
and the software program or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome,
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include,
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and
fragments which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome"
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to
representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and
especially preferably at least 99.9% identical in sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility or size.
In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF such as those enumerated
in Tables 1-3 can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, the information in Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or
other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any
termination codons and is a sequence translatable into protein.
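
The definition above can be made concrete with a minimal sketch that scans the three forward reading frames of a sequence for runs of triplets uninterrupted by a termination codon. This is not the GeneMark method, which relies on Markov models; the function name, the minimum-length parameter and the restriction to one strand are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def open_reading_frames(seq, min_codons=3):
    """List stop-free codon runs in the three forward frames.

    Returns tuples of (frame 1-3, 1-based first nucleotide, length in nucleotides).
    """
    seq = seq.upper()
    results = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                # A termination codon closes the current run.
                if (pos - run_start) // 3 >= min_codons:
                    results.append((frame + 1, run_start + 1, pos - run_start))
                run_start = pos + 3
            pos += 3
        # Report a run that reaches the end of the sequence without a stop.
        if (pos - run_start) // 3 >= min_codons:
            results.append((frame + 1, run_start + 1, pos - run_start))
    return results
```

The opposite strand can be scanned in the same way by applying the function to the reverse complement of the sequence.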

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov
probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists.

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino
acids long and over a continuous region of at least 50 bases which are 95% or more identical (by BLAST analysis) to
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by
September 1996.

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly,
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996.

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand;
and the fifth column indicates the length of each ORF in nucleotides.

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominators. Descriptions of the nomenclature are available from the National Center for
Biotechnology Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the "BLAST identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues.

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as fasta and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the "percent identity" of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations provided below.
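
The distinction between the two measures can be sketched as follows for a pair of aligned ten-residue polypeptides that differ at positions 1, 3 and 5, with only the position-5 pair counted as similar, so that 7 of 10 positions are identical and 8 of 10 are identical or similar. The residue groupings below are a hypothetical, simplified classification chosen for illustration; search programs such as BLAST use scoring matrices rather than fixed groups.

```python
# Hypothetical grouping of residues with broadly similar biochemical character.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]

def are_similar(x, y):
    """True if two residues are identical or fall in the same group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)

def percent_identity(a, b):
    """Percentage of aligned positions carrying identical residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def percent_similarity(a, b):
    """Percentage of aligned positions carrying identical or similar residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return 100.0 * sum(are_similar(x, y) for x, y in zip(a, b)) / len(a)
```

For instance, comparing "ACDEFGHIKL" with "GCKEYGHIKL" (which differ at positions 1, 3 and 5, with F and Y grouped as similar at position 5) yields an identity of 70.0 and a similarity of 80.0.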

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF, in addition to those
ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which
modulates the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the
expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters, and
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described.
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems
of the present invention. Further, the two methods can be combined and used together.

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided
below.

A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which
determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not limited to the specific sequences herein
described, but also include allelic and species variations thereof. Allelic and species variations can be routinely
determined by comparing the sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS:1-5,191, with a sequence
from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated.
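
Codon substitution of this kind can be checked by translating both sequences and comparing the results. The sketch below assumes the standard genetic code, written in the conventional T, C, A, G ordering; the function names are illustrative, and the use of the standard table for Staphylococcus aureus is an assumption of the sketch rather than a statement of this specification.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def encode_same_polypeptide(dna_a, dna_b):
    """True if two coding sequences translate to the identical polypeptide."""
    return translate(dna_a) == translate(dna_b)
```

For example, ATGGCTAAA and ATGGCCAAG differ at two nucleotide positions yet both translate to Met-Ala-Lys, so each is a codon-substituted form of the other.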

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment,
such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments
in question as a probe or primer.

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5'
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample,
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of
Staphylococcus aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a
polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979), Cooney et al., Science 241:456 (1988), and Dervan et al., Science 251:1360 (1991). Antisense
techniques in general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and
OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
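
As a minimal sketch of the sequence-design step only, an antisense DNA oligonucleotide complementary to a chosen region of an mRNA can be derived as the reverse complement of the corresponding coding-strand segment. The function name, the default length of 20 bases and the choice of region are assumptions made for illustration; selecting an effective target region in practice involves considerations not addressed here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def antisense_oligo(coding_strand, start, length=20):
    """Return the antisense DNA oligonucleotide (5' to 3') for a coding-strand window.

    `start` is a 0-based offset into the coding strand, which has the same
    sequence as the mRNA with T in place of U.
    """
    if length <= 0 or not 0 <= start <= len(coding_strand) - length:
        raise ValueError("window lies outside the coding strand")
    target = coding_strand[start:start + length].upper()
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]
```

For example, the six-base window ATGGCC gives the antisense oligonucleotide GGCCAT; a length between 20 and 40 would be passed to match the range stated above.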

The present invention further provides recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a
promoter, operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following vectors are provided by
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.
Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially
available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and
immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells
in order to generate a cell which produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include,
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein
essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a
microbial or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a
polypeptide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly
of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate
and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence,
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the
expressed recombinant protein to provide a final product.

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional
unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be
prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or
proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of
appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd
Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK),
alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled
in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the
heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a
desired protein together with suitable translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis,
Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter, cells are typically harvested, generally by centrifugation, and disrupted to release expressed protein,
generally by physical or chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC)
can be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing
type I signal sequences have the following physical attributes: the length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.
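
By way of illustration only, the sequence-level attributes listed above can be sketched as a simple screen. The
residue sets, the five-residue amino terminal window, the 50% hydrophobicity cutoff and the function name are
assumptions chosen for this sketch and are not taken from Izard et al. or from the inventors' algorithm. The
alpha-helical requirement is not tested, since it cannot be judged from residue composition alone.

```python
# Hypothetical residue classes; the source does not define them.
HYDROPHOBIC = set("AVLIMFWC")
SMALL = set("AGS")


def looks_like_type1_signal(protein, cleavage_after, n_region=5):
    """Return True if the first `cleavage_after` residues satisfy the
    length, charge, hydrophobicity and -1/-3 criteria described above.

    `cleavage_after` is the 1-based position of the -1 residue, i.e. the
    last residue of the candidate signal sequence.
    """
    if not 15 <= cleavage_after <= 25 or cleavage_after > len(protein):
        return False
    signal = protein[:cleavage_after].upper()

    # Net positive charge at the extreme amino terminus (assumed window).
    n_term = signal[:n_region]
    net_charge = sum(aa in "KR" for aa in n_term) - sum(aa in "DE" for aa in n_term)

    # "Primarily hydrophobic" is read here as more than half of the residues.
    hydrophobic_fraction = sum(aa in HYDROPHOBIC for aa in signal) / len(signal)

    # Small side chains at positions -1 and -3 relative to the cleavage site.
    small_at_minus1 = signal[-1] in SMALL
    small_at_minus3 = signal[-3] in SMALL

    return (net_charge > 0 and hydrophobic_fraction > 0.5
            and small_at_minus1 and small_at_minus3)


# Illustrative sequence only: a 21-residue candidate signal plus a short tail.
example = "MKKTAIAIAVALAGFATVAQAAPKDNTW"
print(looks_like_type1_signal(example, cleavage_after=21))
```

A screen of this kind would only flag candidates; the choice of cleavage position still has to come from elsewhere.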

Also known in the art is the type IV signal sequence, which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S.,
J. Bacteriol. 174, 7345-7351; 1992). These are typically six to eight amino acids with a net basic charge followed by an
additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV signal sequence is typically
after the initial six to eight amino acids at the extreme amino terminus. In addition, all type IV signal sequences contain
a phenylalanine residue at the +1 site relative to the cleavage site.

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C., Lipoproteins in bacteria, J. Bioenerg. Biomembr. 22, 451-471; 1990).
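
As a non-limiting sketch, the consensus L-(A,S)-(G,A)-C can be written as a regular expression in which the cysteine
occupies position +1, so that cleavage falls immediately before it. The function name and the example sequence are
hypothetical and serve only to show how the positions map onto string indices.

```python
import re

# L at -3, A or S at -2, G or A at -1, C at +1 (consensus described above).
LIPOBOX = re.compile(r"L[AS][GA]C")


def find_lipoprotein_cleavage(protein):
    """Return the 0-based index of the +1 cysteine of the first consensus
    match, or None if the consensus is absent. The predicted mature
    lipoprotein is then protein[index:]."""
    match = LIPOBOX.search(protein.upper())
    if match is None:
        return None
    return match.start() + 3  # skip the -3, -2 and -1 residues


example = "MKATKLVLGAVILGSTLLAGCSSN"  # illustrative sequence only
site = find_lipoprotein_cleavage(example)
print(site, example[site:])
```

Because only about three-fourths of the precursors examined carried this exact consensus, a miss in such a search
does not rule out a lipoprotein.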

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans,
E. faecalis, S. pneumoniae, and others, have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A., Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity, ASM News 62, 405-410; 1996). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
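
For illustration, the carboxy terminal architecture just described (an L-P-X-T-G-X motif, then 15-20 hydrophobic
residues, then six charged residues) can be screened for as sketched below. The residue classes and the two
thresholds are assumptions made for this sketch; real sequences are less regular than this strict layout suggests.

```python
import re

MOTIF = re.compile(r"LP.TG.")          # L-P-X-T-G-X, X = any amino acid
HYDROPHOBIC = set("AVLIMFWGC")         # hypothetical residue class
CHARGED = set("KRDE")                  # hypothetical residue class


def find_sorting_signal(protein, min_hydrophobic=0.7, min_charged=3):
    """Return (1-based motif start, motif) for the first L-P-X-T-G-X motif
    followed by a 15-20 residue hydrophobic stretch and a six-residue
    charged carboxy terminus, or None if no such arrangement is found."""
    protein = protein.upper()
    for match in MOTIF.finditer(protein):
        downstream = protein[match.end():]
        # 15-20 hydrophobic residues plus the six-residue charged tail.
        if not 21 <= len(downstream) <= 26:
            continue
        membrane, tail = downstream[:-6], downstream[-6:]
        hydrophobic = sum(aa in HYDROPHOBIC for aa in membrane) / len(membrane)
        charged = sum(aa in CHARGED for aa in tail)
        if hydrophobic >= min_hydrophobic and charged >= min_charged:
            return match.start() + 1, match.group()
    return None


# Illustrative construct: motif + 17 hydrophobic residues + 6 charged residues.
example = "MSTNE" + "LPETGE" + "AVLIAVLIAVLIAVLIA" + "KRKEEK"
print(find_sorting_signal(example))
```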

Amino acid sequence similarities to proteins of known function, determined by BLAST, enable the assignment of
putative functions to novel amino acid sequences and allow for the selection of proteins thought to function outside
the cell wall. Such proteins are well known in the art and include "lipoprotein", "periplasmic", or "antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing
as SEQ ID NOS:5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus
polypeptides listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the
Sequence Listing. Likewise the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods.
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general,
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include, for
example, the mature forms of the polypeptides expected to exist in nature.

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4, and in the
case of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequence of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.
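
The translation step mentioned above can be sketched as follows, purely as an illustration. The sketch assumes the
ORF is supplied in frame as a DNA string and uses the standard amino acid assignments, which the bacterial genetic
code shares; alternative start codons are not given special treatment. The function name and the example ORF are
hypothetical.

```python
from itertools import product

# Standard codon table built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna):
    """Translate an in-frame ORF, stopping at the first stop codon.
    Codons containing ambiguous bases are reported as 'X'."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_orf("ATGAAAGGTTAA"))  # toy ORF, not from the Sequence Listing
```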

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide
selected from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto,
form an embodiment of the invention; in addition, polypeptides comprising an amino acid sequence selected from the
group of amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at
least 95% identical thereto, preferably at least 97% identical thereto and most preferably at least 99% identical
thereto, form an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of
the present invention.

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen.
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope."
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance,
Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least
seven, more preferably at least nine and most preferably between about 15 to about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides
that can be used to generate S. aureus specific antibodies include a polypeptide comprising peptides shown in Table 4
below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3.

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids, Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of
such peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides.
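
As an illustration of the x-y notation, the sketch below pulls an antigenic region out of an ORF's amino acid
sequence using 1-based, inclusive coordinates. The function name and the toy sequence are hypothetical; no sequence
from the Sequence Listing is reproduced here.

```python
def extract_epitope(orf_protein, region):
    """Return the residues of `orf_protein` covered by an 'x-y' region,
    where x and y are 1-based and both ends are included."""
    start_text, end_text = region.split("-")
    start, end = int(start_text), int(end_text)
    if not 1 <= start <= end <= len(orf_protein):
        raise ValueError(f"region {region} lies outside the sequence")
    # Convert the 1-based inclusive start to a 0-based slice start.
    return orf_protein[start - 1:end]


toy_orf = "ACDEFGHIKLMNPQRSTVWY"  # toy 20-residue sequence
print(extract_epitope(toy_orf, "3-6"))
```

Under this convention a region such as 36-45 spans ten residues, since both end points are counted.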

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are
substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic
acid and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity
between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs, from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of one of the Staphylococcus aureus fragments or contigs, or of a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
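
The tiers above can be illustrated with a short sketch that takes identity as the measure of homology. It assumes
the two sequences have already been aligned to equal length, with '-' marking gaps, and that identity is counted
over all alignment columns; both are assumptions of this sketch, as is treating the 97% tier as inclusive, since
the text does not say.

```python
# (threshold in percent, threshold itself included?, label), strictest first.
TIERS = [
    (99.9, True, "most preferred"),
    (99.0, True, "even more particularly preferred"),
    (97.0, True, "very particularly preferred"),
    (95.0, True, "particularly preferred"),
    (93.0, True, "especially preferred"),
    (90.0, False, "preferred"),
    (85.0, False, "significant homology"),
]


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both
    sequences; gap columns never count as matches."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def homology_tier(percent):
    """Return the strictest tier met by `percent`, or None below 85%."""
    for threshold, inclusive, label in TIERS:
        if percent > threshold or (inclusive and percent == threshold):
            return label
    return None


identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")  # toy alignment
print(identity, homology_tier(identity))
```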

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-5,191 or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, one skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will
recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-55°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 

Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made, for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the
industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed.,
Macmillan Publications, Ltd., NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes

Open reading frames encoding proteins involved in mediating the catalytic reactions of intermediary and
macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used
for industrial biosynthesis. These include enzymes involved in the degradation of the intermediary products of
metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and
anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes
involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide
synthesis, and enzymes involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as
non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY,
Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using the Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic
acids, which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete
industries, as described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI,
Benett et al., Eds., Academic Press, New York (1985). In addition to industrial applications, GOD has found
applications in medicine for quantitative determination of glucose in body fluids and, recently, in biotechnology for
analyzing syrups from starch and cellulose hydrolysates. This application is described in Owusu et al., Biochem. et
Biophysica Acta 872:83 (1986), for instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Chiral Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides.
Applications of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course
of the washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those
scientists involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies
et al., Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to
sulfoxides, and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated, partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Aminotransferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantages of using microbial based enzyme systems are that the aminotransferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of aminotransferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of
procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a
specific monoclonal antibody.

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of
producing the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)), and include the trioma technique and the human B-cell hybridoma technique (Kozbor et al.,
Immunology Today 4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY,
Alan R. Liss, Inc. (1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the
peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but
are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the
inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass is determined using procedures
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to
produce single chain antibodies to proteins of the present invention.

For polyclonal antibodies, antibody-containing antisera is isolated from the immunized animal and is screened for
the presence of antibodies with the desired specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art; for example, see
Sternberger et al., J. Histochem. Cytochem. 18: 315 (1970); Bayer, E. A. et al., Meth. Enzym. 62: 308 (1979); Engval,
E. et al., Immunol. 109: 129 (1972); Goding, J. W., J. Immunol. Meth. 13: 215 (1976).

The labelled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells
or tissues in which a fragment of the Staphylococcus aureus genome is expressed.

The present invention further provides the above-described antibodies immobilized on a solid support. Examples
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose,
and acrylic resins such as polyacrylamide and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N.Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies
to components within the test sample.

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL, Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier
Science Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the
alternative, if the primary antibody is labelled, the enzymatic or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
fragments and the Staphylococcus aureus fragment and contigs herein described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives,
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention.

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(for example, see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28: 9230-8 (1989)), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above,
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 5: 3073 (1979); Cooney
et al., Science 241: 456 (1988); and Dervan et al., Science 251: 1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56: 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
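
As a minimal sketch of the antisense design step, the following Python fragment derives an oligonucleotide complementary to a chosen region of a coding (sense-strand) sequence. The function name, the example sequence and the DNA-only alphabet are illustrative assumptions and are not taken from the present disclosure; only the 20 to 40 base length range comes from the text above.

```python
# Hypothetical helper: the antisense oligo is the reverse complement of the target region.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def antisense_oligo(sense_seq: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo complementary to sense_seq[start:start+length], written 5'->3'."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases, as stated in the text")
    region = sense_seq.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the sequence")
    if set(region) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result reads 5'->3'.
    return region.translate(_COMPLEMENT)[::-1]

# Made-up 24-nt sense sequence, used only to exercise the function.
example_target = "ATGGCTAAGCTTGACCGTATCGGA"
oligo = antisense_oligo(example_target, start=0, length=20)
print(oligo)
```

A real design would also weigh melting temperature, secondary structure and uniqueness within the genome, none of which this sketch attempts.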

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein, thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane
associated polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As
such, related organisms do not need to be bacterial but may be fungal or viral pathogens.

The pharmaceutical agents and compositions of the present invention may be administered in a convenient
manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treating and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative.
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources,
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single
or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to
decrease the rate of growth (as defined above) of the target organism.

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinyl pyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules, or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun
approach. This procedure, described in detail in the examples that follow, has eliminated the up-front cost of isolating
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING 

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2: 231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, P_0, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
P_0 = e^{-m}, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, P_0 = e^{-1} = 0.37. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence L has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage
of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with
an average sequence read length of 410 bp.
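
The figures in this paragraph can be checked with a short calculation. The Python sketch below is an illustration only; the function and variable names are assumptions introduced here, while the numerical inputs (2.8 Mb, 14 Mb, 17,000 clones, 410 bp) are those given in the text.

```python
import math

def unsequenced_fraction(sequenced_nt: float, genome_nt: float) -> float:
    """Poisson estimate P_0 = e^{-m}, with fold coverage m = sequenced_nt / genome_nt."""
    m = sequenced_nt / genome_nt
    return math.exp(-m)

GENOME_NT = 2.8e6                                # 2.8 Mb sequence, from the text

p0_1x = unsequenced_fraction(2.8e6, GENOME_NT)   # 1X coverage
p0_5x = unsequenced_fraction(14e6, GENOME_NT)    # 5X coverage

# 17,000 clones read from both insert ends at an average of 410 bp per read.
reads = 17_000 * 2
total_nt = reads * 410
coverage = total_nt / GENOME_NT

print(f"P_0 at 1X: {p0_1x:.2f}")
print(f"P_0 at 5X: {p0_5x:.4f}")
print(f"Coverage from {reads} reads: {coverage:.2f}X")
```

The clone count therefore yields slightly under 14 Mb of raw sequence, which is consistent with the "approximately" in the text.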

Similarly, the total gap length, G, is determined by the equation G = L e^{-m}, and the average gap size, g, follows the
equation g = L/n, with n in this equation taken as the number of random sequences obtained. Thus, 5X coverage leaves
about 240 gaps averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.
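
A hedged numerical sketch of the gap estimate follows. It assumes that the n in g = L/n counts sequence reads (about 34,000 for 17,000 clones read from both ends), since that reading reproduces the 82 bp figure; the function name and return layout are illustrative and not part of the disclosure.

```python
import math

def gap_statistics(genome_nt: float, n_reads: int, read_len: float):
    """Return (fold coverage, total gap length, average gap size, gap count)."""
    m = n_reads * read_len / genome_nt       # fold coverage
    total_gap = genome_nt * math.exp(-m)     # G = L e^{-m}
    avg_gap = genome_nt / n_reads            # g = L/n, n counted in reads (assumption)
    n_gaps = total_gap / avg_gap             # equivalent to n_reads * e^{-m}
    return m, total_gap, avg_gap, n_gaps

m, total_gap, avg_gap, n_gaps = gap_statistics(2.8e6, 34_000, 410)
print(f"coverage {m:.2f}X, total gap {total_gap:.0f} nt, "
      f"average gap {avg_gap:.0f} bp, about {n_gaps:.0f} gaps")
```

Under these assumptions the estimate lands in the same range as the roughly 240 gaps of about 82 bp quoted above; the exact count depends on how the read number and read length are rounded.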

The treatment above is essentially that of Lander and Waterman, Genomics 2: 231 (1988).

2. Random Library Construction 

In order to approximate the random model described above during actual sequencing, a nearly ideal library of
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end.

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated
and redissolved in 500 ul TE buffer.

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, and the
LGT agarose was melted and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i,
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended
into 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs),
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears were
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min at 70°C the following day, the reaction mixture was
stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3 (1): 5 (1990)) were used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4/100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation.
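
The stock-to-final concentrations quoted in the plating protocol can be cross-checked with a simple dilution calculation. The sketch below is illustrative only; the helper name is an assumption, and the calculation assumes that volumes are additive.

```python
def final_concentration(stock_conc: float, stock_vol: float, base_vol: float) -> float:
    """Concentration after adding stock_vol of stock to base_vol (additive volumes assumed)."""
    return stock_conc * stock_vol / (stock_vol + base_vol)

# 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 ul of cells.
bme_mM = final_concentration(1420.0, 1.7, 100.0)

# 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar, converted to ug/ml.
amp_ug_per_ml = final_concentration(50.0, 0.4, 100.0) * 1000.0

print(f"beta-mercaptoethanol: about {bme_mM:.1f} mM")
print(f"ampicillin in bottom layer: about {amp_ug_per_ml:.0f} ug/ml")
```

The beta-mercaptoethanol value comes out slightly below the nominal 25 mM stated in the text, which is the expected size of discrepancy for a rounded protocol figure.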

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in
collaboration with 5 Prime → 3 Prime Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted.

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10^ pfu/ml.

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI).

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are run on a combination
of ABI 373 DNA Sequencers and ABI 377 DNA sequencers. All of the dye terminator sequencing reactions are analyzed
using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and ABI
377 DNA sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
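
One way to compare the three reaction types is the expected usable sequence per attempted reaction, i.e. success rate multiplied by average usable read length. The Python sketch below uses only the rates and lengths stated above; the metric itself and the names used are illustrative assumptions rather than figures reported in the disclosure.

```python
def expected_usable_bases(success_rate: float, usable_read_len: float) -> float:
    """Expected usable bases per attempted reaction (success rate x usable read length)."""
    return success_rate * usable_read_len

# (success rate, average usable read length in bp), as stated in the text.
reaction_types = {
    "M13-21 dye primer": (0.85, 485),
    "M13RP1 dye primer": (0.85, 445),
    "dye terminator": (0.65, 375),
}

yields = {name: expected_usable_bases(rate, length)
          for name, (rate, length) in reaction_types.items()}

for name, bases in yields.items():
    print(f"{name}: about {bases:.0f} usable bp per attempted reaction")
```

Under this rough measure, a dye-primer reaction on a plasmid template returns noticeably more usable sequence per attempt than a dye-terminator reaction on a lambda template.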

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using a Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, and ABI 373
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e.,
one-primer synthesis) steps were performed, including denaturation, annealing of primer and template, and extension,
i.e., DNA synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the
need for an oil overlay.
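
The difference between one-primer (linear) amplification and ordinary two-primer PCR can be made concrete with a small calculation. The sketch below assumes idealized, fully efficient cycles; the function names are illustrative and do not come from the protocol.

```python
def linear_products(templates: int, cycles: int) -> int:
    """New strands made with ONE primer: only the original templates are
    copied in each cycle, so product grows linearly with cycle number."""
    return templates * cycles


def exponential_copies(templates: int, cycles: int) -> int:
    """Total molecules with TWO primers (standard PCR): every molecule is
    copied in each cycle, so the count doubles per cycle."""
    return templates * 2 ** cycles


if __name__ == "__main__":
    print(linear_products(1, 30))      # 30 extension products per template
    print(exponential_copies(1, 30))   # 1073741824 molecules per template
```

With thirty cycles, each template yields about thirty labelled extension products in cycle sequencing, which is why the reaction is described as linear.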

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates, and with both
dye-primers and dye-terminators, with approximately equal fidelity, although plasmid templates generally give longer
usable sequences.

Thirty-two reactions were loaded per ABI 373 Sequencer each day, and 96 samples can be loaded on an ABI 377
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols.
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace)
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction.
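
The two editing steps described above (removing trailing low-quality sequence and removing leading vector polylinker) can be sketched in a few lines. This is an illustration only and not the software used: the text describes visual inspection of traces, so the numeric per-base quality scores, the threshold, and the function names below are all assumptions made for the example.

```python
from typing import Sequence


def trim_trailing_low_quality(seq: str, quals: Sequence[int], min_qual: int = 20) -> str:
    """Drop bases from the 3' end until a base meets min_qual.

    Assumption: `quals` holds one hypothetical numeric quality score per base.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    end = len(seq)
    while end > 0 and quals[end - 1] < min_qual:
        end -= 1
    return seq[:end]


def strip_leading_vector(seq: str, polylinker: str) -> str:
    """Remove the vector polylinker if the read begins with it (exact match)."""
    if polylinker and seq.upper().startswith(polylinker.upper()):
        return seq[len(polylinker):]
    return seq


if __name__ == "__main__":
    read = "GGATCCACGTACGTTTNN"           # hypothetical read
    scores = [40] * 16 + [5, 3]            # hypothetical scores, poor at the end
    edited = trim_trailing_low_quality(read, scores)
    print(strip_leading_vector(edited, "GGATCC"))
```

A production trimmer would tolerate mismatches in the vector match and use a windowed quality criterion; the exact-prefix match here is chosen only to keep the example short.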

INFORMATICS 


1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System
Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)). The system used to collect and assemble
the sequence data was developed using the Sybase relational database management system and was designed to
automate data flow wherever possible and to reduce user error. The database stores and correlates all information
collected during the entire operation, from template preparation to final analysis of the genome. Because the raw output
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based
on a Unix platform, it was necessary to design and implement a variety of multi-user client-server applications which
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.
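
To make the idea of a relational store that correlates templates with their sequence reads more concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module in place of Sybase, and every table name, column name, and value is hypothetical; the actual schema is not described in the text.

```python
import sqlite3

# Hypothetical two-table schema: one row per template, one row per read.
SCHEMA = """
CREATE TABLE dna_template (
    template_id INTEGER PRIMARY KEY,
    library     TEXT NOT NULL
);
CREATE TABLE seq_read (
    read_id     INTEGER PRIMARY KEY,
    template_id INTEGER NOT NULL REFERENCES dna_template(template_id),
    primer      TEXT NOT NULL,
    sequencer   TEXT NOT NULL,
    sequence    TEXT NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one template and its two end reads."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO dna_template VALUES (1, 'plasmid')")
    conn.executemany(
        "INSERT INTO seq_read (template_id, primer, sequencer, sequence) "
        "VALUES (?, ?, ?, ?)",
        [(1, "M13-21", "ABI 373", "ACGT"), (1, "M13RP1", "ABI 377", "TTGA")],
    )
    return conn


def reads_for_template(conn: sqlite3.Connection, template_id: int):
    """Return (primer, sequence) pairs recorded for one template."""
    cur = conn.execute(
        "SELECT primer, sequence FROM seq_read WHERE template_id = ? ORDER BY read_id",
        (template_id,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    print(reads_for_template(build_demo_db(), 1))
```

Keeping both end reads keyed to one template record is what later allows an assembler to use the pairing of forward and reverse reads.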

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments
of the genome. In order to obtain the speed necessary to assemble more than 10^ fragments, the algorithm builds a
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S.,
Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal
coverage and raised in regions with a possible repetitive element. The number of potential overlaps for each fragment
determines which fragments are likely to fall into repetitive elements. Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from
two ends of the same template point toward one another in the contig and are located within a certain range of base
pairs (definable for each clone based on the known clone size range for a given library).
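
The hash-table step described above can be sketched as follows. This is a minimal Python illustration and not TIGR Assembler's code: the function names are hypothetical, reverse-complement matches are ignored, and sharing a single 12-mer is the assumed criterion for a "potential overlap".

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set


def kmer_index(fragments: Dict[str, str], k: int = 12) -> Dict[str, Set[str]]:
    """Map each k-mer to the ids of the fragments that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def potential_overlaps(fragments: Dict[str, str], k: int = 12) -> Dict[str, Set[str]]:
    """For each fragment, list the other fragments sharing at least one k-mer."""
    partners: Dict[str, Set[str]] = {frag_id: set() for frag_id in fragments}
    for ids in kmer_index(fragments, k).values():
        for a, b in combinations(sorted(ids), 2):
            partners[a].add(b)
            partners[b].add(a)
    return partners


if __name__ == "__main__":
    frags = {  # hypothetical fragments
        "f1": "AAAACCCCGGGGTTTTACGT",
        "f2": "GGGGTTTTACGTCATGCATG",
        "f3": "TATATATATATATATATATA",
    }
    for frag_id, others in potential_overlaps(frags).items():
        print(frag_id, len(others), sorted(others))
```

A fragment whose partner count is far above the typical value is a candidate for lying in a repetitive element, which is the use the text makes of the overlap counts. Candidate pairs would then be confirmed by gapped alignment, which this sketch does not attempt.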

3. Identifying Genes 


The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf,
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search
method to identify overlaps of 50 or more nucleotides with at least 95% identity. Those ORFs with nucleotide sequence
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and compared
to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases.
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels
are shown in Table 3.
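
The sorting of ORFs into Tables 1 to 3 follows a simple decision rule, which the sketch below restates in Python. The thresholds are the ones given in the text; the function and parameter names are hypothetical, and the search results are assumed to have been summarized beforehand into an overlap length, an identity fraction, and a best BLASTP probability.

```python
from typing import Optional


def assign_table(orf_len_aa: int,
                 blastn_overlap_nt: int = 0,
                 blastn_identity: float = 0.0,
                 blastp_probability: Optional[float] = None) -> Optional[int]:
    """Return 1, 2 or 3 for the table an ORF falls into, or None if unlisted.

    blastp_probability is None when no protein match was found at all.
    """
    # Table 1: match to a known S. aureus nucleotide sequence.
    if blastn_overlap_nt >= 50 and blastn_identity >= 0.95:
        return 1
    # Table 2: protein-level match, ORF of at least 80 amino acids.
    if blastp_probability is not None and blastp_probability <= 0.01:
        return 2 if orf_len_aa >= 80 else None
    # Table 3: no match at these levels, ORF of at least 120 amino acids.
    return 3 if orf_len_aa >= 120 else None


if __name__ == "__main__":
    print(assign_table(200, blastn_overlap_nt=60, blastn_identity=0.97))
    print(assign_table(90, blastp_probability=0.001))
    print(assign_table(150))
    print(assign_table(100))
```

Note that an unmatched ORF between 80 and 119 amino acids falls into none of the three tables under this reading of the criteria.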

ILLUSTRATIVE APPLICATIONS 


1. Production of an Antibody to a Staphylococcus aureus Protein 

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion


Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected
protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use.
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular
Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization 


Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis,
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in Handbook of Experimental Immunology,
Wier, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described,
for example, by Fisher, D., Chap. 42 in Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer.
Soc. For Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic
reagent.

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS: 1-5,191,
can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence,
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow.
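
The primer criteria above (minimum length, similar G/C content, similar melting temperature) lend themselves to a quick check. The sketch below is an illustration under stated assumptions: it uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees C, which is only a rough estimate for short oligonucleotides, and the 4-degree tolerance and function names are hypothetical choices, not values from the text.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def compatible_pair(fwd: str, rev: str, min_len: int = 15, max_tm_gap: int = 4) -> bool:
    """True if both primers meet the minimum length and their estimated
    melting temperatures differ by no more than max_tm_gap (assumed tolerance)."""
    long_enough = len(fwd) >= min_len and len(rev) >= min_len
    return long_enough and abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_tm_gap


if __name__ == "__main__":
    fwd, rev = "ATGGCACAAGATATCATT", "TTAACCGCTGGCGAATAC"  # hypothetical primers
    print(gc_fraction(fwd), wallace_tm(fwd))
    print(gc_fraction(rev), wallace_tm(rev))
    print(compatible_pair(fwd, rev))
```

Because the Wallace estimate depends only on base counts, two primers of equal length with the same G/C ratio always receive the same estimate, which is the rationale the text gives for matching G/C ratios.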

4. Gene Expression from DNA Sequences Corresponding to ORFs

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Commercially
available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California),
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, incorporated herein by this reference.
The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and
ligated to pXT1, now containing a poly A addition sequence and digested with BglII.
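
The primer design described in this paragraph, with a PstI site on the 5' primer and a BglII site at the 5' end of the 3' primer, can be sketched as below. The recognition sequences (CTGCAG for PstI, AGATCT for BglII) are standard; the function names, the 18-base annealing length, and the example ORF are assumptions for illustration, and the extra 5' flanking bases often added so that enzymes cut efficiently near a fragment end are omitted.

```python
PSTI_SITE = "CTGCAG"   # PstI recognition sequence
BGLII_SITE = "AGATCT"  # BglII recognition sequence

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(orf: str, anneal_len: int = 18):
    """Return (5' primer, 3' primer), each with a restriction site on its 5' end.

    The 5' primer matches the start of the ORF; the 3' primer is the reverse
    complement of the end of the ORF, so both are written 5' to 3'.
    """
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    forward = PSTI_SITE + orf[:anneal_len]
    reverse = BGLII_SITE + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    orf = "ATGGCACAAGATATCATTTCAACAATCGGTGACTTAGTAAAATGG"  # hypothetical ORF
    fwd, rev = tailed_primers(orf)
    print(fwd)
    print(rev)
```

Placing different sites on the two primers fixes the orientation of the insert, which is what lets the ORF sit upstream of the poly A addition sequence in the vector.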

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island,
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the
transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant.
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product,
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA.

Alternatively, and if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be
produced using in vitro translation systems such as the in vitro Express™ Translation Kit (Stratagene).

While the present invention has been described in some detail for purposes of clarity and understanding, one
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true
scope of the invention.

All patents, patent applications and publications referred to above are hereby incorporated by reference.

Table 4

ORF     SEQ ID NO  BLAST HOMOLOG                  Antigenic Regions
                                                  Region 1  Region 2  Region 3  Region 4
168_6   5192       lipoprotein                    36-45     84-103    152-161   176-185
238_1   5193       chrA                           21-39     48-58     84-95     232-249
51_2    5194       OppB gene product (B. sub      20-36     70-79     100-112   121-131
278_3   5195       lipoprotein 1                  20-29     59-73     85-97     162-171
276_2   5196       lipoprotein                    21-33     65-74     177-186   211-220
45_4    5197       ProX                           28-37     59-69     85-100    120-129
316_8   5198       hypothetical protein           45-54     88-97     182-192   243-253
154_15  5199       unknown                        31-40     48-58     79-88     95-104
228_3   5200       unknown                        25-38     40-52     64-74     80-89
228_6   5201       unknown                        29-41     89-101    128-143   173-184
50_1    5202       unknown                        21-33     52-61     168-182   197-206
112_7   5203       iron-binding periplasmic       21-31     58-67     92-101    111-120
442_1   5204       unknown                        30-39     91-100    122-137   182-192
66_2    5205       unknown                        50-59     104-116   127-136   167-182
304_2   5206       Q-binding periplasmic          19-28     48-57     75-84     103-116
44_1    5207       hypothetical protein           27-36     86-95     129-138   192-201
161_4   5208       SphX                           27-44     149-161   166-175   201-210
46_5    5209       cmpC (permease)                21-33     61-70     83-92     100-109
942_1   5210       traH [Plasmid pSK41]           83-92     109-118   127-142
5_4     5211       ORF (S. aureus)                12-22     87-96     111-120   151-160
20_4    5212       peptidoglycan hydrolase (S.    24-34     129-138   141-150   161-171
328_2   5213       lipoprotein (H. flu)           81-90     123-133   290-299
520_2   5214       fibronectin binding protein    44-54     63-79     81-90     95-110
771_1   5215       emml gene product (S. py       30-39     65-82     96-106    112-121
999_1   5216       predicted trithorax prot. (D   7-16      120-129   157-166
853_1   5217       ORF2136 (Marchantia polyr      43-52     88-97     102-111
287_1   5218       psaA homolog                   13-22     28-44     72-82     114-124
288_2   5219       cell wall enzyme               14-23     89-98
596_2   5220       penicillin binding protein 2b  40-49     59-68     76-87     106-115
217_5   5221       fibronectin/fibrinogen bindi   28-37     40-49     62-71     93-111
217_6   5222       fibronectin/fibrinogen bp      10-19     31-40     54-52     73-92
528_3   5223       myosin cross reactive prote    4-13      29-47     60-73     90-99
171_11  5224       EF                             20-31     91-110
63_4    5225       penicillin binding protein 2b  12-21     59-68     95-104
353_2   5226                                      46-55     62-71
743_1   5227       29 kDa protein in fimA regi    23-32     68-79     94-103    175-184
342_4   5228       Twitching motility             10-19     48-60     83-92     111-121
69_3    5229       arabinogalactan protein        97-106    132-141   158-167   180-189
70_6    5230       nodulin                        36-45     48-57     137-160   179-188
129_2   5231       glycerol diester phosphodie                        55-74     97-106
58_5    5232       PBP (S. aureus)                26-35     70-79     117-126   152-161
188_3   5233       MHC class II analog (S. aure   72-81     94-103    115-124   136-145
236_6   5234       histidine kinase domain (Dic   24-33     52-57     81-94     106-121
310_8   5235       clumping factor (S. aureus)    59-71     77-86     93-102    118-127
601_1   5236       novel antigen/ORF2 (S. aur     45-54     91-104    108-117   186-195
544_3   5237       ORF YJR151c (S. cerevisiae     76-90     101-111   131-140   154-164
662_1   5238       MHC class II analog (S. aure   22-32     71-80     89-98     114-122
87_7    5239       5' nucleotidase precursor (    29-45     62-71     105-114   125-137
120_1   5240       B65G gene product (B. sub      102-111

Table 4

ORF     Antigenic Regions (cont)
        Region 5  Region 6  Region 7  Region 8  Region 9  Region 10
168_6   244-272   303-315
238_1   250-269   291-301   308-317
51_2    140-152   188-208   211-220   256-266   273-283
278_3   198-209
276_2   255-268
45_4    177-199   221-230   234-243   268-279   284-293   304-313
316_8
154_15  148-157   177-187   202-211
228_3   101-119   139-154   166-181
228_6
50_1
112_7   136-149   197-211   218-229   253-273
442_1   199-210   247-257   264-277   287-309
66_2
304_2   178-187   250-259
44_1
161_4
46_5    131-141   162-176   206-215   243-252   264-273   285-294
942_1
5_4     189-205   230-239   246-264   301-318   340-354   378-387
20_4    202-212   217-234   260-275   314-336   366-373   380-391
328_2
520_2
771_1   145-154
999_1
853_1
287_1   154-164
288_2
596_2   121-130
217_5   244-253   259-268   288-297   302-311
217_6   144-158   174-183   188-197   207-216   226-242
528_3
171_11
63_4
353_2
743_1   197-207
342_4
69_3    195-211
70_6    206-215   263-272   291-301   331-340   358-371   390-414
129_2   117-127   141-157   168-183   202-211   222-231   261-270
58_5    184-203   260-269   275-299   330-344   372-381   424-433
188_3
236_6   138-147   163-172   187-198   244-261   268-278   308-317
310_8   131-140   144-153   177-186   190-199   204-213   216-227
601_1   208-218
544_3   170-179   184-193   224-235   274-287   327-336   352-361
662_1
87_7
120_1

Table 4



ORF 




Antigenic 


Regions 


(cont) 








Region 1 1 


Region 1 2 


Region 13 


Region 1 4 


Region 1 5 


Region 16 














238^1 i 1 ; 


5i_2 ; : ; 


278_3 














276_2 i 




45_4 














31o_8 








154_15 i 








228_3 












228_6 


I 

























1 1 2_7 














" 442_1 















66_2 










304_2 












44_1 

161_4 
46_5 


306-315 " 


r - - - 


























942_1 
5_4 ' 


"393-407 ' 










4T6-426 _ 
410-419 


456-465 








20_4 


396-405 


461-481 






328_2 










520_2 










771_1 










999_1 




■ 






853_1 






1 


287_1 












288_2 










596_2 








217_5 
2 1 7_'6 








I 












528_3 1 








171_n ; 








63_4 ! 






353_2 ! 






743-_l ! 






342_4 










69_3 








70_6 


453-471 


506-515 








129_2 
58_5 


296-315 






















188_3 1 ' 


236_6 


358-377 


410-423 


428-439 


442-457 


467-476 


480-493 


310_8 


238-251 


256-275 


281-290 


296-310 


314-333 


3^8-347 


60i_l 1 ; ' ! 


S44_3 














662_1 












87_7 
120_1
Table 4



ORF Antigenic 


Regions (cont) 






Region 1 7 Region 1 8 


Region 19 Region 20 


Region 21 


Region 22 


158_6 




238_1 








51_2 








278_3 1 








276_2 i ; ■ 




L . . ..... 






316 8 ■ ' 








1 54_1 5 1 


.... ,^ 








; 






111=1 • ■ 








50 T ^ " 


■ : 






1 1 2_7 ! 


: 




442_1 
















304_2 ! 
















— "^r^ — i 








46~~5 1 






942"! ' 






5_4 I 






20_4 






328_2 i 






520_2 1 






771_1 ■ 








999_1 > 








853_1 i 








287 1 i 






288_2 1 






596_2 j 




217_5 : 1 




217_6 








528_3 1 








171_11 




63_4 i 


1 1 


353_2 1 




743:11 ' ! ^ 


342_4 1 : • 


69_3 








70_6 ' 








1 29_2 








58_5 ; 


188_3 








236_6 


310_8 357-3G6 370-379 


429-438 443^52 


478-487 


^5 1j:S60 


G01_1 








544 3 


6S2_1 








87_7


120_1
Table 4



Antigenic Regions ( cont) 



Region 23 Re g ion 24 ' Re g ion 25 \ Re gio n 26 Re gion 27 Region 28 



45_ 4 

316__8 

' 154_15 



-304-2 
"44_1 



_46__5 _ 
942_1 



353^; 
743_ 



622-632 670-685 708-718 823-836 858-867 877-8
ORF Antigenic


^Regions 


(cont) 


Region 29 

TS8_6 


Region^30_ 




238_f 




51_2 






278_'3 






276_2 




45_4 




316_3.. 






154_r5 






228_3 






228_6 






50_1 






1 12_7 






442_1 






56_2 1 






304_2 






44_1 




161_4 




46_5 I 




942_1 






5_4" 






20_4 






328_2 i 


520_2 






771_1 ! 


999_1 


853_1 1 


287_1 


288_2 ' ' 


596_2 






217_S 






217_6 






528_3 1 ! 


171_11 ! 


63_4 1 




353_2 . ' : 




74r_i 




342_4 i ! ! 


6913 ' 


70_6 : 


129_2 


58_5 


188_3 






236_6 


310_8 






601 _1 






544_3 






66~2_1 






87_7 






120_1

Table 4

ORF     SEQ ID NO  BLAST HOMOLOG                  Antigenic Regions
                                                  Region 1  Region 2  Region 3  Region 4
46_1    5241       aldehyde dehydrogenase         8-17      35-52     83-96     112-121
63_4    5242       glycerol ester hydrolase (P.   9-25      57-73     93-107    123-133
174_6   5243       ketopantoate hydroxymeth       71-80     203-212   242-254   265-274
206_16  5244       ornithine acetyltransferase    1-10      34-43     54-63     194-210
267_1   5245       NaH-antiporter protein (E. t   120-129   332-347   398-408
322_1   5246                                      58->5     153-164   203-2yi   264-284
415_2   5247       transport ATP-binding prot(    108-126   218-227   298-308   315-334
214_3   5248       2-nitropropane dioxygenase     123-136   216-233   283-292   297-306
587_3   5249       clumping factor                5-14      43-54     59-68     76-95
685_1   5250       signal peptidase               59-68     72-81     86-95     99-108
54_3    5251       fibronectin binding protein 1  23-32     37-46     50-59     89-98
54_4    5252       fibronectin binding protein 1  43-52     66-75     95-104    147-156
54_5    5253       fibronectin binding protein 1  4'9-6b    81-90
54_6    5254       fibronectin binding protein 1  55-71     82-97     139-158   175-186
328_1   5255       lipoprotein (H. flu)           11-20     61-70     96-105

Table 4 



ORF 




Antigenic 


Regions '(cont) 








Region S 


Region 6 


Region 7 


Region 8 


Region 9 


Region 10 


_ - - 


215-242 


333-352 


376-385 


41G-432 


471-487 


63_4 


145-154 


191-202 


212-223 


245-265 


274-283 


291-300 


174_6 














206_16 
267_1 


239-_259_ 


275:284__ 











322_1 
4T5_2 


298-319_ 
344-'353_ 


350-359 
371-380 


""'39 5"404~ 


456-465 


"~ 486^495" 


5A8-527' . 


214_3 


3"l8-337 


365-375 










537_3" 


106-115' 


142-151 


156-166 


173-182 


186-198 


_'2p4-2"l3_ 


685_1 


; 113-122 


130-145 










54_3 


T 128-138 


185-194 ' 


21 7-226 


251-260 


268-277 


__295.-305_ 


54_4 


175-188 


191-200 


203-212 


220-229 






' 54_S 


-J 1 












54_6 


i Z20-230 


287-304 


317-326 


344-353 


364-373 


" 378^87~ 


328_1
Table 4



5 


— ORE 




Antigenic 


Regions 














Region 1 1 


Region 1 2 


Region 13 


Region 1 4 


\ Region_1 5 


Region 17 




46_1 

_ 63^4 

1 74^6 


~ 3'06-3T5~ 


319-328~ 


: ~ 3"66-~376 ' 




74537462^' 


467-4767"" 


W 


206_1 6 








_ 










267_1 
















322_1 
41 5_2 


539-555~ 














214_3 












IS 


587_3 


217^T~ 


: 2 78-287 _ 


1 318-327 


332-342 


1351-360 ;377-386 




685_J 

54_3 

saJ^ 


" 31 6-32 5~ 


; 

j ^329-345 


! — 

355-372 

-i 


387"316ll^ 


416-425 


'438-448 _ 




54_5 


i 


i 


1 








20 


54_6 


J 396-407 


. 427-436 


1 514-531 


541-550 


i 569^8" ' 


' 61 2-62'2 




328_1 








I 1 1 


25 








Table 4 










ORF 




Antigenic 


Regions 


(cont) 










Region 18 


Region 1 9 


Region 20 


Region 21 


Region_22 


Region 23 


30 


46_1 















63_4 


485-500 


513-525 












174_6 
















206_16 
















267_1 














35 


322_1 
415_2 
















214_3 
















587_3' 


"^96'-405 


"4"26~4"42" 


459^4'7d" 


485-494 


505-514 


53"l"-562 




685_1 














40 


54^3_ 


U55-462 


472-491 


'517-536 








54_4 

















S4 5 1 ! 1 










54_6 


1639-648 


673-681 


1703-715 


723-732 





: 772-788 




328_1
Table 4



ORF 




Antigenic 


Regions 


^cont) 






46_1 


Region 24 


Region 25 


Region 26 


Region 27 


Region^ 28 


Region 29 


63_4 














174_6 













.- 

















267~1 














322_1 
a'] 5_2 














214_3 














587_3 


567-578 


Sf^ 601 


607-840 


844-854 


858-870 


'877"886 


685_1 














54.^3 

54'_4 
5415 














54_6 


793-802 


811-826 


834-848 


866-876 


893-903 


■907-918 


328rf 













Table 4 

25 



ORF 1 Antigenic Regions 


(cont) 


Region 30 Region 31 

46_1 
63_4 " ' 


174_6 

_206_16_j 

267_1 ' 1 " ■■ 




322_1 ; 
415_2 ' 




214_3 i ' " 




587_3 889-911 j927-936 




685_1 1 




54_3 ' 1 
54_4 i i 
" 54_5 "'"i "i 




54_6 925-944 951-997 




328_1 ! 1 1 



SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: Human Genome Sciences, Inc.
    (B) STREET: 9410 Key West Avenue
    (C) CITY: Rockville
    (D) STATE: Maryland
    (E) COUNTRY: US
    (F) POSTAL CODE: 20850

(ii) TITLE OF INVENTION: Staphylococcus aureus Polynucleotides and Sequences

(iii) NUMBER OF SEQUENCES: 5255

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 Mb storage
    (B) COMPUTER: HP Vectra 486/33
    (C) OPERATING SYSTEM: MSDOS version 6.2
    (D) SOFTWARE: ASCII Text

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 60/009,861
    (B) FILING DATE: 05-JAN-1996

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5895 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

10 

TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60 

GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120 

,5 aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180 

GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240 

TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 3 00 

^° CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 3 60 

AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 4 20 

AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 4 80 

25 

TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 540 

TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600 

CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660 

TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 720

TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780

TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 840

AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900 

TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960 

AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020

TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080

GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140

GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200 

TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260 

TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320

TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380

CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 1440

TCATTTGCAA AGGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 1560

GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620 

CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680

GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740

ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800 

TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860 

CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920 

CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980 

AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040 

CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100 

TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160

TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220 

ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280 

CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2340 

ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400

AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460 

TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520 

TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580 

GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2640

TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700 

AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760

CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820

TTTCGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAACTTT 2880

TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2940

AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTG 3000 

TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060

TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120 

TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180 

TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240 
CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360

TGTCAAAAGT CATTGCTTTT TTATCTTTTT TAAATAAGCC CATAATTATT GCTCCTTCTT 3420

TAGTAAAGAA TACTTAATAG ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480

TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540

AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600

TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660 

GTAGTTTAGG GCTTGCAACG CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720

CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780 

ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3840

GTCGTGATGC TAATCCTGAT TCGAATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3900

GTACAGATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960

CAGATAACCC GAAACCAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020

CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080 

ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140 

ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200

GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260

ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320 

CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380

ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGGAGAA GTTACTGATG 4440

AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500

TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560

ACTATOGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620 

CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680

AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4740

CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800 

TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860

CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920

TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980

TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5040

AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160

GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220 

ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 5280

GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340

TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400 

TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460

GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520 

ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580 

AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640

ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700 

GTTATGACTT GAAATTTTGA CCATVATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760

TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820 

ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880 

TGATAGTGCT AAAGA 5895
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6796 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTTGAAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60

TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120 

CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180 

ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240 

CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300

TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360 

AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420 

AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480 

ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540

AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780

CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 840

TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020

AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200

AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 1380

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 1440

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500

GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680

ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 1740

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800

GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860

ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920

GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980

GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160

AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220

TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2280

AGATATGGTG TGATGCATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2340

TATGTAGAAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 2460

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2520

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2580

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2640

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2760

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2820 

AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880 

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2940

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000 

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3060

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120 

ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180 

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 3240

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 3300 

CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 3480

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3540

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600 

GGAQCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660 

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720 

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3780 

CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3900 

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3960

CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4020

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4080

ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4140

ATTACAACAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAATTGGCAC AAGAATGGCG 4260

AGGCGATAAA CAATTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320

TTTAGTTGTC AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380

ATCAGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440

GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500

CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560

AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620

ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680

CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740

AAGCAAGAAA TTTCACCTAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800

GAAATTGCAA GAAGAATGGC CAAAGTTGTC GGCGCGCCAT TTATAAAAGT AGAAGCTACT 4860

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4920

GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4980

GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5040

CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100

AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160

AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280

CT^TTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAAAATC 5340

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400

GAATTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460

AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 5520

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580

ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640

CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700

ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760

ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820

CAAGATACAG ACAACATTGG TGCACGTCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5880

GATTTATCAT TCGAAGCACC AAGTATGCCG AATGCAGTTG TAGATATTAC CCCACAATAT 5940

AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG 6060

TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 6120

AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 6180 

CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 6240

AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC 6300 

AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 6360 

ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 6420 

GATGATTTTA ATG2LAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 6480

GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 6540

ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 6600

GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 6660

ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6720 

CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780

TTAGAAAAAA GTAAAT 6796
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60

kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120

GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180

TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240

TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 300

GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360

GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420

ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 480

TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 540

TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 660

TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720

AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780 

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840

GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 960 

GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020 

ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 1080 

GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140

AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200

TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260

TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 1380

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 1440

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 1560

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620 

TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTCTAAT GCACTAAACG GTTCATCCAT 1680 

TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 1740

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800

AAGTAATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860

TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920 

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 1980 

AATATAACCT TCACTTAAGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 2040

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60 

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120 

TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180 

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 300

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420 

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 540

CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600 

TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 660

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 780 

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 840

ACTAGTGTTC TTTTTTAGCT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 960 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 1020 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080 

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140

TACCAAACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260 

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 1380 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 1440

TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 1500 

TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560 

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 1620

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGctAGCGC 1680 
TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT 1800

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 1980

ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2040

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100 

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 2160 

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220

TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 2280

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2340

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 2460 

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 2520 

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2580 

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2640

ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700 

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACGT ATTTGTTAAA CTTAATGATA 2760 

TGCnTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2820

AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2880

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2940

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 3000 

CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3060 

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120 

TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180 

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3240

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300 

TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360 

TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 3420

TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 3480

GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3600

GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3660

GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3720

GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3780

TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3840

GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT 3900

AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3960

AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4020

ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4080

GCACCTTCTT TAGCATACGC AATTGCTGCT GCAOSCCCTA TTGCTGAGTC ACCACCTGTG 4140 

ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG 4200 

GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 4260

GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 4320

TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 4380

CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4440

CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4500 

AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4560

CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4620 

GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 4680

AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4740

CTTGTAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4800

GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4860

TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4920

TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4980 

CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 5040

TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100 

CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160

GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 5220

TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 5280 



EP 0 786 519 A2 



CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTOGCA AGTGGTAAGT 5400

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 5460

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 5520

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580 

TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 5640 

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5700 

AATTGAGACT CTATACAAAA ACXSTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 5760 

GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820 

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880 

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 594 0 

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6000

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060 

GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120 

TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180 

CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240

AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300 

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG GCTTTTTGCA 6360

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 6420

CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 6480

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6600 

CATAACCACT CTTACTTTCA ACTGCAAGCA CGCCGTGTTT AATCATAGTA AGCAAATCAT 6660 

GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6720 

ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780 

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6900

CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960 

CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7020

TATGATTAAT TATTAAATCA TTCATTACTT ATCACCTGCT TTATCAATCA TTGGAATATG 7080

AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 7200

TCCATCTGCT ACAACT^CCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 7260 

GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 7320

CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 7380

AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCXSTC 7440

ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500 

AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7560 

TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCXSCACGCT CGATATCTTT 7620

TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 7680

AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7740

TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTC7AG 7800 

CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7860

TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7920

CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7980 

CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8040 

TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 8100 

TTTCGTATCA ATT03CTTAT CAACACGTGT TTCATCAACA TCCACGCAAA TTGCTACCCC 8160

ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 8220 

TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 8280 

ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 8340

TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 8400

AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 8460 

AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8520 

TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 8580 

AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640

TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8700 

TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760

CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8820

CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880

CATATTAGCC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 9000

GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9060

CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120

AGCGTTACAT CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180

GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 9240

ATACATGATA AACATCAATC TATAGAGCAA TTACTGAATT TTAATATTCA TTTAGCTATA 9300

ACAAATGAAA AAATAACCCA CGAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 9360

ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TGAAAATTTG 9420

CCACTCATAT TACCAAACAA AAATTCTCAA GTGCGCAAAC ACTTAGATGA CTATTTTAAT 9480

AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 9540

TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGAGAT TTTATTACCA ATCATTTCAC 9600

ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 9660

TACCATAAAA AACGCAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 9720

TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9780

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9840

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9900

CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9960

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 10020

CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 10080

CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA ATCGATACAA AAATAATTTA 10140

TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATGCTT 10200

TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 10260

GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 10320

TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 10380

TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440

TTTAGAAGGA AGAGTTAGAG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 10500

TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 10560

GGCTAAACCA CATATGACAT TTTACAAATA AGGTGTCATT ATAAAAAGGC CTCTTGAACT 10620

CCSTTAAAAT TTTAATTAAT TATTATATAA TAAGAGAACT TTTCAAACAA TACAGTTGTT 10680
TTACTGCAAT
TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10800

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10860

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10920

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAr AATACGTTTT GCTAATTCAA 10980

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040

TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 11100

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 11160

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 11280

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 11340

ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 11400

CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 11460

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520

TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640

ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700

TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820

AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 11880

ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12000

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATAT7 TTGCTTGTTA CCTAATGGAG 12060

GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 12120

CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180

TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12240

TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300

GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480
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TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA CAGAATTTGC 12600 

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660 

GACTAAAGGA CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 12720 

TATTTTCTTG TTTAGCGAAG AAATCGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATCCGAG GTATTGCATA 12840

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12900

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12960

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACTGAA AAAACAGAAC CTCAACATCA 13080

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTCAGT 13140

AGAATCAGCm GaATCAGTTA AACAAGCACC ACmAAGTGAC gATTTaACAA ACGATTCAAA 13200

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 13260

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13320

A 13321 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8549 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420

AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 480

AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 540
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AAACAGGACT CCACATAAAA ATCAACTCCT TTATATACCA TAATGATACT ATATTTTCTA 660 

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 720 

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 78 0 

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 84 0 

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 900 

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 96 0 

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 1020 

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 108 0 

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 114 0 

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200 

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260 

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 1320 

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 138 0 

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 144 0 

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500 

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560 

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620 

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680 

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 174 0 

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800 

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 186 0 

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 192 0 

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980 

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040 

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100 

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 2160

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220 

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280 

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2340 
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CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 24 6 0 

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2 520 

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2 580 

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 264 0 

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2700 

TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2760 

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAATCTTTAT 2820 

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880 

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 294 0 

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3 000 

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 306 0 

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120 

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180 

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 324 0 

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300 

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3 360 

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 3420 

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 34 80 

GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 354 0 

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 36 00 

ATAITCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 366 0 

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3 720 

TTATCTTTAT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3780 

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3 840 

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3 900 

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACACTTAC TAAACCAGAA 3960 

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4 020 

CCAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 40 8 0 

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTATT TTTTTCATCT 414 0 
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TGCXJTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 4260 

TCTAATCGCG TTAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 4 32 0 

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4 3 80 

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4 440 

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4 500 

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 4 560 

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 462 0 

TGTACTTTGA ACAACTTGTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4 680 

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4 74 0 

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4 8 00 

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4 86 0 

AGCGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 492 0 

ATAATTTTGT TAATTGTCCC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4 98 0 

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 504 0 

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100 

TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160 

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 522 0 

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 52 8 0 

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 534 0 

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 54 00 

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 54 60 

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520 

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580 

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 564 0 

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 570 0 

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 576 0 

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820 

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5 880 

CCATTTTTTA TAGCCTCGTC CATTGCTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 594 0 
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TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060 

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 618 0 

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240 

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6 3 00 

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 636 0 

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 64 20 

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480 

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 654 0 

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660 

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 67 8 0 

CX3CCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6 840 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960 

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020 

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 708 0 

TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 714 0 

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200 

TCTCCTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 726 0 

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 7320 

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 73 8 0 

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 74 4 0 

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 75 00 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7 56 0 

TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620 

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 76 80 

ACAATCAATG AGCTGTCTAT AAATTGACCA TTAGGTCTTA GACGACTTAG CATATAGCCA 774 0 
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ATTACTGCAT 


TTGTAAgAGG TGCAAGTTCT GTCACAAATA AAAATTCTTG CTTATCAGGT 7860

TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 7920

TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 7980

AAGGTTTCGC CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 8040

GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 8100

CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 8160

TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 8220

TTAGCTCTAT AATCTCGACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 8280

ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 8340

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 8400

TGctTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC 8460

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAGCTTTCCA 8520

ATGTGTAGCT GGAAAGTACT GTTTATCGT 8549
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(2) INFORMATION FOR SEQ ID NO : 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3601 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 60

AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 120

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180

GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 240

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 300

AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 360

ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 420

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480

GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 540

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 600



EP 0 786 519 A2 



TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 7 20 

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780 

TCCAAAAGAG GCATATGTTG ATATTGcAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 84 0 

CATGATTACG GATAATATTC CAGArcCAGT TGAAAaTwCm GATGCTATAT ATnr,CAGATGT 90 0 

TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 96 0 

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 102 0 

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080 

TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 1140 

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 1200 

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260 

CCTCATATTG GTATTAAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320 

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 13 8 0 

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 14 4 0 

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 15 00 

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 1560 

AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 1620 

ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 168 0 

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 174 0 

GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 18 00 

ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860 

TCA(^GGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920 

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 19 80 

nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTATC 204 0 

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 2100 

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160 

GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 2220 

GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 22 80 

AATAAAAATG GAGCACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 234 0 

TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 24 00 
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CGACAGCAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGCCTCAAT TTATGCTATA 252 0 

TGGCTTATAT ATGCAGCAGG TATCAATTAC TTATTATTGA CGATGTTACT TTATATTCCA 2 58 0 

GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 2 64 0 

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 27 00 

GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 276 0 

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACT^CTAT 2820 

TAGATGTGCX3 ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 2880 

ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 294 0 

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA TGATAGGCTA 3000 

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3 060 

CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTAA 312 0 

AcCAaTTGGT CCTTTTTATA CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 318 0 

CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGcGT CACCACTACC 324 0 

TCaATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 3300 

TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3360 

AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420 

CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 3480 

ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 354 0 

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3600 

A 3601 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 60 

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 120

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 180 
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TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300 

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360

GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 540

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1221 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180

TTCATmTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 300

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 360

ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 420

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 480

TGACSrACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 600

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA PJ'JATTPACT 660

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 720

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 840

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020

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AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGTTAA ATTCTGAAGT 114 0 

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 

ACCCGTTCAT CACTGCACAT C 1221 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120 

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 240 

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 300 

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 360 

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 420 

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 540 

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 600 

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 660 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 840 

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 1020 

AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080 

CAATAAGAAA 1090 
(2) INFORMATION FOR SEQ ID NO: 10:
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(A) LENGTH: 904 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 120

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 180

AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 240

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 300

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 360

AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 420

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 480

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 540

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 600

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 660

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 720

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 840

TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT 900 

GACG 904 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 60

AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 120 

TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 180 
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TTAATAAGAC GATTCAGCAA GTTTTAAAGT ATTATTTGAC TATGTTGGAT TAGGCATCTA 300 

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCGATGCCTT 360

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAGT ACTGAATACG CCGCCTGTAC 480

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 540 

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC TTTATTAGCA GAAGTTAAAC 600 

CATCTCGTTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 660 

GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT TAGTTTTAAA TAAATTGGCA 72 0 

CTGTTACATC ATGTTGTTTT TTAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 780 

CTTTATCATG GAAGTTTTGA AGATTTTCAG TATTTGGAGA ACTGATGTTG ACTGTGAAAA 84 0 

ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CGCGCTTCAT 900 

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 960 

GCAAATGACT TAGTGCTTTG TTCATACCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 1020 

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 1080 

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 1140 

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200 

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 1260 

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 1320 

TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 1380 

GAGGCTTACT ATCCTCAACT TAATATATGT GAAATATATT CTTTTAATAG ACTAGCATTT 1440 

CCATACATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 1500 

CGTTTTTAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 1560 

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 1620 

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 1680 

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 1740 

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 1800

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 1920 

GCTTCTCATT ACCTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 1980 
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GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAGT GCGACGTCAA ATTGTGGCGG 2100 

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 

ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 22 2 0 

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280 

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2340 

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 2400 

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 2460 

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520 

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580 

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2640 

AGGAATGATA CATATTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700 

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 2760

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATmTA AGACCAATAG 2820 

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880 

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2940 

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3000 

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3060 

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120 

GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAGCATTGAT AGATGAAAGT 3180 

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 324 0 

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 3300 

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 336 0 

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 342 0 

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 34 80 

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3 540 

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600 

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 366 0 

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 372 0 

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 3780 
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ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3900 

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3960 

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4 0 20 

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT GCAACATCAT AAAGTGGCTA 4 08 0 

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 414 0 

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4 200 

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4260 

CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4320 

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 4380 

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4440 

TATATCACGC GGATTATATC TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4 500 

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCcLPiT 4 56 0 

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTGC ATATCAACTT GCAAAAAATG 4 62 0 

GCTCTGACAT CGCACTTTAT ACTAGCACAA CCGGTTTAAA TGATCCGGAT GCTGATCCTA 4 68 0 

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTGCTC 4 740 

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4800 

ATATCAGTTT TGATAGCGGA CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 4860 

GCTTTGATGC AACAAAAAAT CCAATCGTTC AACAATTATT TGTGACAACA AATCAAGATA 4 920 

TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4980 

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 5040 

TACCTGCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 5100 

ATTATCAAAA AAATCTWIATG TATTTAGATG ATTATTCATG TTGTGAAGTG TCATGCACAT 5160 

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220 

TTATTTGGTT ATTAGTCATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 5280 

ATTAATCATA AAACAGAAAT TAGATATTTC ATTGGC7VGAA AGTGTAGTTT CTTcGCCTAT 5340 

AGCGAGTGAA CATGTGATAG AACAATTGAC ACTATTTCAA CATGAGCGAC GACATTTAAG 5400 

ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 5460 

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTATGTGT CAGCATGGAT 5520 

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACGATATA TTACAGCATA TATTATTTGG 5580 
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ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 5700 

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 5760 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 58 2 0 

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 588 0 

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5 940 

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6 000 

CTTAGCGTTC CX3CTACCTTA TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 6060 

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 6120 

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACGCCAATTT TGAATGTTTT ATCACAAATA 6180 

TTTACACCTA TATTATCGTT ATTAGGCATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 624 0 

TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 6300 

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 6360

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 6420 

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 6480

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6540 

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600

AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 6660 

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6720

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 6780 

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6840

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6900

TGCTTGTTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6960

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7020 

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7080

AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 7140 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 7200 

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 7320

GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 7380
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CGGAAAGGGT ATTTCAAAAG AAGACTATCA ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7500

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCGTAAA 7560 

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 7620 

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 7740

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 7800 

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 7860 

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 7980 

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC ACAATGTCAT CATTTTTATC 8040

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 8100 

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 8160 

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 8220

GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 8280

TAACTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 8340

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 8400

CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 8460 

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAT TTAGCACCAT 8520 

TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTAAACC CAGTTCATCT AAAAATACAC 8580 

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 8640 

CTTCTTTCTT GACTAAATCA AACTGTGGCT TCGTTAACAT GCCACTCGGT GTGATATAAA 8700 

AATTATTTTT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAGCTTCAT 8760 

CCATACGTTC TAAGAAGAAT GGGATAAACT CACCCCAATG TCCAATAATC ATATTTAACT 8820

TTGGATAACG ATCAAAAATA CCAGATAATA CTAGATGTAT TGTATGAATG CCGACATCAA 8880

TGTGCCAACC ATAACCAAAA CAAGCAAATG TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8940 

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 9000 

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAACCATCTT 9060 

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 9120 

CTCGCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAGCGATTGG 9180 
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TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 9300

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATTCGTCG GCATTTGTAA 9360

AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGGATCTG 9420

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 9480

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 9540

AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720

AAATCAGTAT TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATCCTTTA 9780 

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 9840 

ACTTTAGATT CAGCTGTTTC GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 9900 

ACGTTGACGA CAACAGGTTG TTCAGATTTT TCTAAGAGAG GGACGAATGT ATTCATCATT 9960

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACX3AGG TGTCAATTTG 10020 

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10080 

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACATCTAAT 10140 

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATTCCGAGAA 10200 

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 10260 

TTATTGCCTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10320 

GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 10380 

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440 

TCATGGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 10500

TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10560

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 10620 

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT GACGTAGTAG 10680 

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10740 

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800 

ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 10860 

AATTCTCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10920 

TAGCAATGAA TTTGCAATAA CTATTAAATA TCATAAAAGA AAAGAGTGTT GATAATGTCT 10980 
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11100 

GTATTAGCTA TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160 

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 11220

GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 11271 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6261 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CAACCCGTTC AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 60 

GAAAAACTTT TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 120

TGTTTCATAA AATGTAACTT AACTGTGCCT GTTGGACCGT TACGTTGCTT AGCAATGATA 180

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 240

TCTTCATCGC CGCCACGGTT ATAGTAATCA TCACGGTATA AGAATGCAAC GATATCGGCA 300

TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 360

TGTTCAACAC CACGAGATAA CTGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 420

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTCCTGTT GTCTGTTATC GGACGCACGT 480

GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 540

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600 

ATAA3UUVTCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660 

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 840

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900

CGATATCCTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960

TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020

GATAGCTCTA AAATTCGACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080

TTATATCCAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140
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TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACGGAAA 1320

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800

CTGTAACTCT AATTGTTTAA GGTTACCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 1920

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCT^CAACA 2160

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220

GGATGATAAA TTTTATCGTC TGAACCATGC GCAATGGCTA TGCCATTATC TTCAACTTTT 2280

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400

GATCCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT CTGCAAGGAT ATCTTTTAAC 2940
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CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3060 

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200

TGATAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740

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CAAATACCAA C2CAATCGTT GCAATTATTG TTGCTTTAGG TTGTATTTTT GAAAACACAT 4860

AAGCCACTCC CATATTTTTA ACTATAGCTA TTATTTTAAC CTCTTTAATG AAAATTAACA 4920

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4980

TAATTAACAT ACTCATTTGA ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 5040

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACATTTCCAC TCTCATGTAC 5100

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220

ACTAAAGTTT AGTAGGaATC CATCATGCCC AACATTATCA GGCACGAAGA AATGACGATG 5280

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 5340

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 5400

ACCTCGGTCA ATGTTGTGAC TATCCAATAC ATCTAGCAGT GTCAGATAAC AATTCAAATC 5460

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCCGGCGT 5520

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 5580

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 5640

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 5700

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTCCAAGA CTTCCCCCTA TTAAAATATT 5760

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820

TGTTAATTTT TTAGGAAAAT GAGGGTCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 5880

AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CATCAATAAT 5940

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000

ATGACAAACT ACAACTAATG GTTGTCCATG ATAACCGACA TGCTCATATC TCAAACGCAA 6060

QTnATCTATG ACTTCCCCAG ATTCTGTAAT AAATTCCCCT AAATTTAAAG TATCTACTGT 6120

GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180 

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240 

GGTTGCTGAA GTATCACAGG G 6261 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1222 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60 

ATACGAGaTT CATACTCGTT yAACACTTGT TCGTCGAATT CTGTATTAGC CATTTCATCA 120 

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 180 

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 240

AAACGATCAG TGATAGCATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 300

CTTAATTTTT CAGAGTTTTG AATGCGTTTA ATATCTATTT CAAGTTGCTC TATTTCGCCT 360

TCTTTTAGAT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 480 

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 540 

AATAATGTTA AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 600 

TCTAATAATT CTTGCATAAC TTTTCGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660 

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 720 

ACATCAATAT CCATATTTTT CAATATATGT ATAGCATCTT TACTCTCGTC AATATCAAAT 780 

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 840

CCAATTAATT GTCCAATTGC ATCTATAATA ATTGACTTAC CTGAACCCGT TTCACCACTT 900 

AAAACAGTTA AACCATCAGA AAATTGAATT TCTAATTCTT CAATAATAGC AAATTGCTTG 960 

ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020 

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC ATCACCACAA 1080 

ATTGTGCCTA GTACTTCTTC CCAATTGATT TGGTCTAATA TAGCTCCAAT AGATTGTGCA 1140 

TTACeAGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200 

CCATTAAATA ACGTCCCAAT TT 1222 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60 
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TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300

GATACTATGC AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420

ACCGATACCA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 480

CTTTTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540 

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600 

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660 

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720 

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780 

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840 

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900 

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020

T 1021 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3759 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60 

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120 

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180 

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300 

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420 
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ATAAAAtAGa ATTCyCCAGG kTTTACtTTA AtatATCyAA gTAtCGaCtC tATCGTTCCG 540

TGTTGAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660 

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720 

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780 

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020

TCCATTATAT TTTGATTTTG TTCTCATTTA CATOSTATTA TTAACTTCCA CATTTCAAAT 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 1140

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TTATATCTCT 1200

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACGAAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500 

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560 

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620 

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680 

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740

CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800 

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860 

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920 

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980 

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100 

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160 

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220 
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tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 2400

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA GCCATACATC ACGCGAATCA 2520

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC CATATGCATA 2580

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700 

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760 

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820 

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060

CCATTACTGC AGCTAJcTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120 

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180 

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240 

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360 

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420 

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CX3CATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600 

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660 

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720 

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 480

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 660

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900

TTGGTGGATC ATGGGTTTAG ATTACCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG rrAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380

TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680






EP 0 786 519 A2 





TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTATTTGAA TTAAAAGCAG AAGGGTGACA 1800

AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCG TTTTGGTGCT TGTCCGACAT 2640

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAAffATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3060

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

CGTTAAACAG AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480
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AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600

AAGTACCTGA ATATCGCAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCX3ACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440

ATGGTQATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280 
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ACATTAAAGT TAOATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840

TTCX3TAAACA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020 

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080 
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TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340

AGGTATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460




TTATTTAGGT 


GAAGAAGACG 


AAAGTCATTA CTACTGGGAT AAATTGAAGC 


AAATTTCTAA 


8520 




AGTGGAAATT 


GGACATGCGC 


CTTGGGTAAT TGAAAATAGC 


AAAGAAGTTT 


TTGACCAACA 


8580 


45 


TATTTTGCCA 


TTACTTCAAA 


GTGATGACAG TCATTATCGT 


TTATATGGTA 


TTTTTTTATT 


8640 




GGATCAATTA 


AATGGTAAAG 


AAATTGTGAT GACGGAAAGT 


ATTTGGCAGG 


TTTTGGAAAA 


8700 




TCTAAATAAT 


TATGAGAAAT 


TGTATTTAAC GTATTTAGTT 


CAAGGTTTAA 


CGCTCAATAA 


8760 


SO 


ATTAGACTTC 


ATTCATCGCG 


GCTTATTAAC GCTTTACCAT AATGAATTAT 


TTGTAAGTGA 


8820 




AAATGATGTA 


ATGGTTGCAT 


GGATTAATCA AGGTGAACTC 


ATAATTGCTG 


AAAAAGTAGA 


8880 
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TCGAAACGTT ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000 
CAAAATGATT GAATTTCTCT TGAGCATATA GATTTATGAA AAGTTAGATT TATTATATAA 9060 
TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120 
AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATACGCAT CACGTGCTAA 9180

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600 
CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAGCGTATTT TACAAGATAG 9660

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720 
AAAAGACGGC AAAGTGGGTT CTGTGACATT AACGTCTACA AAAGATGGTT CAGAAGAAAC 9780 
ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840 
AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900 
AGTACCAGGT ATTTTTGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960 

TGCTACTCGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020 

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080 

TTTTAATAGT GTCATCACAG CGTTAAAATA ATGTCTTACT TTTAAATTAA AGCAAATTAT 10140

ATAG5AAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200 

TATGtTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAATGATAAA GAGCCACTAA 10260 

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380 

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAGCTGA CTTTCCGCCA GCTTCTGTGT 10440 

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500 

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560 

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620 

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 10680

ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT GTTGGGGCCC CTGACTAGAG 10800

TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860

TAGACCCTAG GACATTGATT TATGTCCCAA GCTCCTTTTA AATGATGTAT ATTTTTAGAA 10920

ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980

ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040

AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAATC 11100

TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGCACATCCT TTGATGGAAC AAGGTAAAAG 11400

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640

TTATGTAGTA GATTTAAGAC CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GCCATCGGTT GTACGGGTGG 11820

ACAACATCGA TCTGTAGCAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGA 11940

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060

QGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300

CCATCTACAA ATACAAGTGT GCAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480

GCXJTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600

GAAACAGATG GrTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660 

CCGTTTATTG ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720 

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780 

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGATATGATA 12960

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080

AGACGT 13086
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60 

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 240

AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300 

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 420

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480 

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540 

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600 

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660 

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720 

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780

ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140

CGCTGCCTTT TGACCATCAT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 1200

ATAAtCAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350 

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 300

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540

TAATTGCTGC AGCCX3GTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 840

AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020 

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080 

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 1140 

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200 

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7363 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60 

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 360

CTAAJAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600 

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660 

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720 

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780 

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840 

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900

TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680

AAGGGCTTTT TGAAGTnTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATACC 2580

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700

AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060 

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120 

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180 

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240 

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300 

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360 

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420 

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080

ATACGTATTT TATAAAAelAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500

AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AGCACGTGAA 5460

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTTAGTTGC TGCACTTTTA 5580

CAAGAGTTAG TAGAAGCARA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760

ACAAAAAAAT TCGACCGTAA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940 

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000 

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060 

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240

TAACCT^TGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300

CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6540

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 6600

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 6780

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 6840

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 6900

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 6960

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 7020

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 7080

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 7140

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 7200

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 7260

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 7320

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 7363

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 60 

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 120

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 180

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 240

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 300

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 360

ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600

AAACATACAA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900 
TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960 

AATTAACTAA AATTAAGAGA TAATTTGTTA CX3AGTGATAA TACGAaGJcGG TaTCATACCG 1020 

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 1080 

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 1140

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 1200

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 1380

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620

CAG^TGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680

CATATTGCGC TACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800 

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860 

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920 

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980 

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2040

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100 

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160

GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2340

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTTGCCCC TCCCATGTAT ATCCTACCCA 2460

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520 

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580 

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 2640 

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700 

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCGCATTCT TGGAAAGTTG CCTGTTCATT 2940 

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000 

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060 

AACTGTGTTG CCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120 

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180 

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540

GAATACCACC ATGTCXJCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900 

ATTCCGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960

TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTTGATAA 4140

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCTGTTCCTG 4740

TCCAAATTTT AACCGTCGGT TGAGATGCX5C TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5040

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220

CGTTTTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC GAACATAACC AGCGTGTTTG ATAACCTTTT 5580

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700

GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760

CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120 

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180 

AGTTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 624 0 

TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 6300 

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360 

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420

GTGCTCGTGT AGCGTTAscC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600

GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 6720

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200 

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260 

TTTAGGTGCC GGTGTAGTTT TGTCTGGATG ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 7440

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500 

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560

TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680

AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7740

AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 7860

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7920

TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 8040

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGACTA TCTTCTATTC 8100

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 8160 

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8220 

ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 8280 

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 8340

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 8400

TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 8460

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 8520

TTTGCATACT CGCAACCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 8580

GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8640

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8700

CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 8760

TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 8820

GGCA3TCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880

TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8940

40 

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180 

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 9240

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT 9300

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 9360

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GAATTTATCA TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATCATTCTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 9540

TTTGTATTTT GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600 

ATTATCAACA ACATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720 

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960 

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020 

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080 

TAACGTTTGT GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 10140 

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200 

TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260 

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320 

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380 

GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440 

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300

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TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420 

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540 

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660 

TACCGTTTCA GGTGCACCAA AATTACGTGC AATTGAAAGA ATATATGAAC AATATCCACA 720 

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATACATAAAT TGTAATCATA ACTTAGATTT 780 

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 840

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900 

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960 

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020 

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 1140

TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 1260

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 1320

AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 1380

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 1440

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500

TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CGACATTAAA GAGCTAATCG 1560 

ATATfiCTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 1620

CGGAGCGAGA AATCCAACT^ CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 1680

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 1740

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800 

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920 

ACCTTGTATT CATTGGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980 

GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040 

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100 
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AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220 

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280 

GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 2340

CAAGTCGAOS TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 2400 

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 2460 

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2580

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 2640

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2700

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760 

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2820 

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2880 

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2940

TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3000

TTAGAACGTG CCTATAAGGT TAATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3060

CGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATGCGTT GTGACAATCT ATCTGAATTT 3240 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3300

CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA GGTTTCATCC 3360

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 3420

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 3480

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3540

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3600

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3647

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 60

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG CTGCTAAGTT ATCGCCACCA GATTCTATGA 120

AAATTAGTTC TATATCGTCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 240

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 300

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT TCAATTAATT 360

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT AATCGGATTT GCCACAATTA 420

TAACCTCCTA TGATATGAAA tTCTAACATT GaCGTTCTCA TGCGCCATTT GATTTAGTTC 480 

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 540

TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 600 

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660 

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 720

TnACTTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 780

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT kTTGCACCAA CACGTGTTTC 840

TTTAGGTAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 900

ATTTTCCAAT GCATCATAAA CTAaACGCAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 960

TTGTAAAAAC ATTTTTAACC AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 1020 

AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080 

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140

CCT'IATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 1200 

CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260 

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 1320

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 1380 

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 1440 

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500 

GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 1560 

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 1620 

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 1680 
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GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800 

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2040

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100 

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220 

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280 

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 2400

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700 

AACCGACATT AATCGGTAAA CcTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760 

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2820 

TCGTAATACC ACTTTCTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880 

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 2940

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3000

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3060 

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 3180

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TCTCCAACAG 3240

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300 

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360 

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TACCACGAAA 3420

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 3480
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TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840

TCATT7VACTC TGCAACXjGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900 

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080 

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 4140

TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4320

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4380

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440

ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4560

TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620

TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4800

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4860

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4920

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4980

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCXSACTTTT TTAGGTGTGT 5040

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GGCTTACCTG 5100 

CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160 

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280
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GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG 5400

ATATGGATAA TTCCTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 5460

AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520 

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 5580 

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 5640

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 5760 

GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 5820 

GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 5880 

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 5940 

AAAGTGTTGG TTTTTTCTTA GTAGAC 5966



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17310 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 60 

ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 120 

TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TACCTAACTT 180 

AAAGSTGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 240 

AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 300 

AATCATACGA TATGTATACA AAATAATGAm AAACTGTnAA AAATGATTTG CCTTTAATAA 360

ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 420 

AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 480

CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 540

TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 600 

TGTAATCACT GTCTATTAAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 660 

TCATCATACA TATACCATTA TCAGCTACTA ATTCTGAAAT ACCGCCAACA TGACTGGCTA 720
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TTATTAAAAT AAACGTATCG TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 900

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960 

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020 

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200 

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260 

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320 

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCXS 1380 

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620 

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680 

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740 

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800 

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860 

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920 

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980 

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100 

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160 

TAATTCATCA ATGCXJTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220 

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280 

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340 

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400 

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460 

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520
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AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820 

AGCGTACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2880 

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000 

TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180 

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 3300

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 3420

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCX3ACCC AATAAAACCA GCCCCACCAG 3480

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3600

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3660

TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3900 

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3960 

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4020 

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4080 

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140

TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200 

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260

AATACCCCAT AGCACACCTG CAACGTTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320
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GCCAAATTGC GCC3GCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4440

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500 

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680 

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800 

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040 

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100 

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160 

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220 

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 5280 

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 5340 

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460

CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520 

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580 

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700 

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760 

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820 

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880 

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940 

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000 

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060 

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120
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TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 6420

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 6480

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6540

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600

TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 6720

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 6960

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020

ACTTCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 7260

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380

GCAlS^GTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560

AGCGCAgcTT CAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG Cy^TAAAATG 7680

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 7860

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTAG GATTTTTCGT 7920
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TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 8040

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 8220

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 8340

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 8400

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA 8580

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8640

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 9000

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180

ACTTPTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240

TGctCACGCT TAAAACGAAC ACCATCATGG AAATCTTTTA AGGCAATACG TGTAGGCCAA 9300

CCATTTTCAT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720
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TATTTTGTCG TGTCTATTGG CGACATCGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9900

ATGACATGTT CAAACTGCCA TGGGTGTACA GGTATCATCT CAACATCATT TACATGTTTG 9960

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020 

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080 

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC CAAATTTCAA ATTATCACTC 10140 

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200 

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260 

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320 

TCATTTTTAG GAAATGTAAA TACAACCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380 

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440 

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500 

ACACCGTCTT GATATGACGC TTTATACACA ACAATATTCT CATAAATAAG TGATGATACC 10560 

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620 

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680 

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGcAATGGTC GGCGTTTGAT ATAAATCTTC 10740 

TACACAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800 

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 10860

ATGTCCTAAG TGATTTACTA CAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920 

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980

CGTTCTAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATC 11040 

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAACAT TCAAGCCATG CTTCTGGTGC 11160 

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220 

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 11280

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 11340

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 11400

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 11460

GTCATTCGTA CGTATAAAAT TAGTGATTTT AACGTGTATC GGTAATTTTA AATAAATGTT 11520
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GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCX3CAA 11640

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480

cACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 12540

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12960

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020

CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080

CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140

AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320
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CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440 

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500 

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620 

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680 

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740 

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13920

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980 

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040 

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160 

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280 

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 14400 

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 14460

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580 

TCACTAATCT CTTTOSCAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 14640

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880 

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120 






TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240 

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300 

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420 

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480 

CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540 

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600 

CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660 

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720 

TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780 

CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840 

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900 

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 15960 

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020

CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080

TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140 

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200 

CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 16260 

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320 

ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440 

AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500 

AATCGAACCT GTTGAAcCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560 

CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620 

TTCGGGTACA CGACTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920
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TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC 17 040 

AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100 

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160 

TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220 

CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 17280 

TACAGAATCT AACAATGAAT CGTGCACATG 17310 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5423 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60 

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120 

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180 

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360

AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420

TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540

CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 600 

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720

ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900

CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960

AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTCGTG AAAACAGTTG TAATGATTGT 1020

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ATAAGCGACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 114 0 

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 1740

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800 

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860 

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920 

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980 

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100 

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160 

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220 

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280 

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340 

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 2400 

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460 

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520 

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820 
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AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT 294 0 

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG 3000

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATTGACA CGTTTCTTTT CTAATTCTGA 3060

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC 3120

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3240

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC 3300 

AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC 3360

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT 3420

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT 3480 

TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA 3540

GCCGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3600

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG 3660

CAAGTTTGTA ACX3TTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT 3720 

AGTTATATTC TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATCTGTAATA CCAAGTGTTG 3780 

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3840

TTCCACTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTTT CCCAAcTCCG TTAAAATATC 3900

CGAACCGAAT TCTTCTAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CATCAATGTT 3960

ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA CTTCGCCAAT 4020

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GACCGATAGA 4080

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA 4140

ACTTAATTCT AATGACTTTC CGTTAATTTC TACATTCATA ACTTAAAATC TCCATTCATA 4200

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC 4260

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4320 

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 4380 

AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT CCATATTCAT ATTCATATTC 4440 

ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT 4500 

CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4560 

TTCATACAAT ACGCGATCTA CAACTGCATC TTCAATTTCA TCTGCAAAAT CGTCACCATA 4620 
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GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4 74 0 

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920 

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980 

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040 

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100 

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340 

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400

CTGATATTGC GTGATaAATT AGO 5423

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60 

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120 

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360 

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420 

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600

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TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 720 

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780 

AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 840 

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960 

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020 

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080 

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 1200

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500 

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620 

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680 

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740

TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860

ATAWVCGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920 

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980 

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

CGCAATCAAT TTTAGATATT ACATCGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCT^CAG 2220 

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400






TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2520 

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATAG CTCAATGGTA 2640

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700 

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760 

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180 

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240 

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360 

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420 

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480 

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660

ATATTCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780 

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020

TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200
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AACATCAATT CAAACTTTAG AAGATGTGAA 
AGGATGGAAT GAAAATGACT AAGAGTGCTT 4320

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4380

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4440

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4620

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4980

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460

AAAAffrCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700

TAGGCTTTAC TTATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820

CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 5880

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5940

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000
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ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 612 0 

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240

TATTCACTTC A 6251 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120 

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180 

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240

ATCAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300 

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360 

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420 

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480 

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600

CAGGGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720 

GCAATCGTCC CTTTTAATTT AACTTAGAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780 

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840 

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900 

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960 

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020 

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140
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TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 126 0 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG cGAAAAATAT 1560

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TXSGaTTGGGA TAAAGCATCA 1620 

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680 

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800 

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860 

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100 

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 2160 

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220 

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280 

GACTTACTAC CAGAAGGATT TACACATCCA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2340

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400

AGAGCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640 

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2700 

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760 

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820 

CTAAATCGTT TACGTGAATT TACTGCAAGT AOSATTAACA ACTATGAAAA CTTTGACTAC 2880 

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940 
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CAAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 306 0 

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120 

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180 

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240 

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420 

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 3600

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACGAAGT TACCGGTCAT 3660 

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720 

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840

AAACTTTGTC AATGGTATCA CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3900

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960 

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020 

GTCCXSAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4080 

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140 

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200 

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380 

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440 

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4500 

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620 

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740
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TTTTCCTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 486 0 

CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300

GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600

TCACTAGAGG AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60

AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120

GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180

TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240

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AACCTTAGCC AAGACGTTGA ATGTACCATT TGCAATTGCA 
GATGCGACAA GTTTAACTGA 360

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATT3 ATAAAATTGC 480

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540

ATTGCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080

GAaGGTAGTT ATTACAGCAC AAACtnATTAA TGrAGaUVCTG AACCAG 1126

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATTGftCTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60

GTTGGTCAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120

GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180

AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240

AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300

GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360

AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540
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CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT G60 

ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720 

AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAT ATCGGCTTTG AAGTCGTTGA 780

TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900 

GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960 

TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020 

TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200

CGCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260

TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320

CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCCC GTCAGCTAAA 1380

GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500

TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560

CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620 

ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATGTG GCAAAATCAC 1680 

CAAATGACAT TGCGTGTTGG TGAGGATATC GATGTGGACC AATTTCTTAA CAAATTAGTT 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGCATATTG GTGAATTCTC ATTGCGAGGA 1800

GGTMTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860

GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920 

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980 

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040 

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100 

CAAATACTAC GTCX3CTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160 

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220

AGTTTAACAG TAGAGTCTGA TTCGTTTATT AGCAATATTA TTGAAAGTGG TAATGGATTT 2280

ATAGGACAAA GTTTTATAAA ATATGATGAT TTTGAAACAT TGATTGAAGG CTATCCTGTC 2340
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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 24 60 

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520 

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760 

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820 

CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880 

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940 

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3000 

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060 

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120 

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180 

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300 

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360 

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480

CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600 

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTCAGT GATTGAAACG 3660

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720 

AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3780 

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840 

GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900 

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3960 

AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4020 

TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080 

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140

TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380

GCTAAAATTG AA 4392

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60 

GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120

TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240 

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300 

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360 

TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420 

TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480 

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600 

TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720

AGCGTTTGA 729 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT 120

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA 180 

TGTCCTTCAA TATCAACTCG TGGAATAATG CX3ATTCAACC ATGCTGGTAA ATACCACGAA 240

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA 300

AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT 360 

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT 420

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr 480

ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA 540

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA 600 

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG CCATTAATGA CAAGACGAAT 660 

CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT 720 

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA 780 

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA ACCCATATTT ATCTTGTGCA 840

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT 900 

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA 960 

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT 1020 

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA 1080 

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 1140

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC TAACTGGTAT TGCAGCTGCT 1200 

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 1260 

CATGGCGTAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA 1320

TSGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA 1380

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA TGATAAACCA 1440

CATACTGCAA TTACAACTGT TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT 1500 

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAACTTGTCG ATATCTGAAT 1560 

AAAATAAATA ATGCATAATC GATACCAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT 1620 

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACCAGAGGCT 1680 

AGACCAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT 1740

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA 1800

AAATGACTTT TAACATTATC TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980 

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100 

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160 

TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220 

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280 

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCTTTT AAAATCAACG 2340 

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400 

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460 

AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520 

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700 

TCCAAACGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760 

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820 

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880 

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940 

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATAG ATTAATCAAT ATTAAGATTA 3000 

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120 

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180 

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300 

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360 

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420 

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480 

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540

CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600

TTCTTCTTGC ATGATTCTAT CTTTAGATTT AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780 

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTGAACATT TTTGACCAGA 3960 

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCTGTATCAA TATTTTCATC 4020 

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140 

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200 

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260 

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320 

TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440 

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATAGCTTC 4500 

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560

AATTTCCGCT TTTCGACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620 

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTCATT 4980

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040 

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100 

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC GAACGACTTT 5160 

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220 

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280 

CCAAGTATTA CTGCCAATAT GAATGGGTCC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340 

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400

TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA ATCGAAATAC TTACATTGTC 5520

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 5580

TGTATGATTT AATTCAAAGC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 5640

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 5700

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 5760 

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880 

AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060 

CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 6120 

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180 

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 6240

ATGTTTGCAC GGCAATCTCT CTTTTTCnT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 6300 

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360 

AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420 

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 6480 

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 6540

AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600 

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660 

GTTpVAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 6720 

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6780

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6840 

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6900

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6960 

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7020 

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080 

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 7140

ACAAGCGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCGAAG 7200

TTAAAAGATG GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7320

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 7380

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 7440

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7500

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7560

CCGATTCAAC GATTGGCAGA TATTATTTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 7620

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 7680

CTTGTTGCGA GTATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 7740

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7800 

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 7860 

ATTACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGCAG AAGCCATTGT CAATTATGCA 7980 

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8040

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100 

AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 8160 

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AGTGGCAGAT 8220 

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 8280 

GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 8340 

GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC GAAACTACAG 8400

CAACAAGGTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGTT 8460 

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580

ACCATTCGTA ATATTCGTCA AAATCTATTT TGGGCATTCG GCTATAATAT TGCCGGTATC 8640

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700

TCAGTAAGTG TTGTCACAAA CGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 8760 

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 8820 

ATTGGCTCTA TAATGTCGCG GTTTAyaGTt GGATCTTCGC TCCAACTGCA TATATAGTnA 8880

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8940

ATTATTATAA ATAATAAGTA CACTACGGtT TACAGTTGGA TCTTCGCTCC AACTGCATAA 9000 

GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 9120 

TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180

GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 9240

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 9300 

AACTGATGAG AATCCCAACA ATCCAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 9360 

AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 9420

TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 9480

TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 9540

AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 9600

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC TIAATACTGCG 9660

CCCTTTTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 9780

TCGACATACG TTGCAATTTT AGCATTAGGA AACGGtCGTA TGCGACCACA TCACTTTGAT 9840 

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 9900 

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9960

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 10020

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 10080

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 10140

CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 10200 

AGCCATCATA ACCAGCGACA CCTTCAACAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 10260

CTACerCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 10320 

CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 10380

TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 10440

GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10500

GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10560

AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 10620 

GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 10680 

TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAACATCCC AGCATTTGTG 10740 

TGCGTTTGGT ACTTTTTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 10800

CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AACCGAAAGC ACCATAAGCA 10920

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG cTATCGCTTC ATCAAAAACT 10980

TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 11100

TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160 

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220 

CCATTTTTTG TACCATATAA AATGCATACT TCATCTTCTT TATCTAACGT CACATTATAT 11280 

TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 11400

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 11460

TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640 

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700 

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760

ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820 

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880 

TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940 

CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12000

CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12060

TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120

ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180

AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12240 

TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12300 

CTACCTGTTT TAAGTTCCGG CGTCGGCATT AGCACATAAA TACCAGTTTT GCCTTCTGGC 12360

GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420

CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480

AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12540 

GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12600

ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12720

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12780

TGAGCCATGC CATACATACC GCCTTTAATA AAATGCACAC CAAACATCAT TTCAATCATA 12840

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12900

AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12960

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG TCATATTATA AAAGTCACTC 13020 

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACX3TGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140

TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380

ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCGCTGCTA ATCCTGTGAC ACCTGCACCA 13440

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13500 

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13560 

CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13620 

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13740

CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT AAAATATATC CGTTCATTGT 13800

CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60

AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 120 

AAATGTGACA AACTTAGAAC TAATATCAAG TGTTGATGTT TTGAATATAA AAATGCTAAT 180

ATAATTGGTT AATATATGAG TAATTAGAAA ATAGACAAAG GATGACGATT TATGTATATC 300

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 660

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 840

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 900

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCGC 1440

ATTGATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980

                      GGATTCTTAT CTATGTTTGT ATCGAACACT GCAGCTGTAA 2100

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCCA 2160

ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220

CAGGTACGAT            GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 2280

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2340

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 2400

GACATGATTT GAAATATTTa CCTGGTGGTC AGACGTTAAT TAAACAAAAG TTAGACXSAGC 2460

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2520

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580

GTACGATTGC            TCAATATTAT            TCCAGCTAAA AATACTGAAA 2640

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTTCTGA            GCAAAATGGT 2760

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820

           CTTATTTTTA ACTGAAGTGA CATCTAATAC            ACGATGATTT 2880

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2940

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3000

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3060

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 3240

TAASAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3300

TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 3420

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 3480

CGAGACTCAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3540

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3600

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 3660

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 3720

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 3780

TTTTTAATAT CAGAAATGGA ATCTGTTCCA TTACCATATA ATGCAAAGTT AATATCTAAA 3900

CGTATTTCAC CGTGTGCAAA GACATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3960

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4020

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 4080

TACTGACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA TAATTGCATA 4140

TGCTACACCT CTACTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4200

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4260

CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320

TTGTGCTTCG TTACGTTCGA ATAGTTCGAA TTGCTTTGCA GCGGTCAAAT CSATCGACAAT 4380

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTGCTCG 4440

ATCTTTATTA AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCATCTG CAACTTTTTC 4500

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 4560

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGCCACCG ATTTTTAAGT CCATATTGGA 4620

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTTTCTT AATTTTGGAT AATGATTATG 4680

TTCAATAGGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 4740

AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4800

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4860

ATAGCGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGGAGCAT ATATAGTAAT 4920

ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4980

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATGACA TTTAACGAAT CCCAAAATAA 5040

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTTCTTC 5100

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 5160

ATTGATAATT TGCGTTTCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGC 5220

CTCTTTAAAT AACATTAACA ACACCTCGCT TTATATTATA GTCTACATTA TTAAAATACT 5280

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 5340

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 5400

TCGATTGGAG AGATACAAGT GTACCAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 5460

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5520

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 5580

TTAAAGTCTA TTGGCTACAA GGATGATTTC TTATCATATT TAAAAGATTT AAAATTCACA 5700

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5820

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6000

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060

GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 6240

CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 6300

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 6360

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 6420

ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 6480

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 6540

GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6660

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6720

CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 6780

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 6840

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCGATAGT GCTGAAGAAA TTATGGAATT 6960

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 7020

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 7140

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT GAACCAGATG AAATAGTAAC 7200

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 7380

TAAACGACAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7560

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 7680

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 7740

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7800

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 7860

CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920 

GGTTGAAGCA GTTTCGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 7980

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 8040 

AACX3TTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100 

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 8220

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 8280

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 8340 

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 8400

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 8460

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8520

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 8580 

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 8640 

TGAfiAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 8700

CGTtTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 8760 

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8820 

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8880 

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AGAGACCTTT AATAAGATTA ATAGTTTACT 9000 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060 

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120 

CAATAACAAT TGAGCTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180

CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 9300

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 9360

TTCTAAAATA CAAATTTTAA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 9420

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 9480

TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 9540

CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 9600

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 9660

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 9720

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780

TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 9840

TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 9900

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 9960

GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 10080

TGAGTTGT 10088

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7563 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 60 

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 120 

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 180 

TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 240 

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 300 

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 360 

TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA ATCACAATAG CTGTTGAGGA 420 

TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 480

TGCCACACTC CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 600

GTATTTGAAC ATAAAAATGT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660 

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720 

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 780 

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 840

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 900 

AGAAAATCX3T ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 960 

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 1020 

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080 

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 1140 

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TAOSATTGGT TCTAGTGCTA TGGAGATATT 1200 

AAGACGATAT TGTTTCGATA AAGCTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 1260 

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 1320 

TCAATCATTT GTACTTATAG ATCATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 1380

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 1440 

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500 

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560 

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620 

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 1680 

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 1740 

CGTA3TAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 1800 

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860 

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 1920

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 2040

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 2100 

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 2160 

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 2220 

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2280

CGGGACGCTA TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA 2400

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AGCAATGGAT TTAATGGCAA 2460 

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2520

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2580 

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 2640

CGTTTGGTAA ATCTAAAGCA GGCGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 2700 

TATTCTTTAT GATTGcAGcG CCAGAAGGTG GCGCCCAAAC ACATCTAGAT GCTTTAGCTA 2760 

AGTTGTCTGG TATTTTAATG GATGAAAATG TACGTGAGAA ATTATTACAT GCTTCATCAC 2820 

CTGAAGAAGT ACTAGCGATC ATAGATGAGG CTGATGATGA AGTGACAAAA GAAGAAGAGG 2880 

CAGAAGCTGA AGCACAACAA GTTGCAACTG CAGAACAATC ATCTAAACAA TCTAATGAGC 2940 

CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060 

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGTATCATTG 3120 

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180 

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 3240

GTAAACCTTT TGTTGCCCGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3300

AATTAAGCCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 3360 

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 3420 

TTAATCCAAA 7VAGCTCAGAG TACAATGCX3T TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 3480

AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT CATTGCACGT AGTATTGCGG 3540 

ATAAACCTGG TTTCX3CTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3600 

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATCCAC 3720 

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGC CTTTAATCCA CCAGCATCTT 3780 

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 3840

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3900 

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGCACCAATT ACAGCTGCAA 3960 

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTGCGACAGC GATGTTAATT TTTAGACGTA 4020 

AATTTACAAA AGAACAACXJT GGTTCAATTA TCCCTAACTA TGTGATGGGT ATGTCATTTA 4080 

TC5ATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 4200

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACXJAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220

GAAATTGAAG GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTC7VATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC tctacatacc GAAATGATTG TTGATGGCAC TCATTCTCAT 5460

CCGGCATCGG ttgcaattgc ttaccgtatg AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

gatgcaatgc gtgcaaaagg tatgcctgaa GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CGAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880

TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6060 

AATAGGGAAT TAATTGGAAA CTTCX5ACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120

TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 6240

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300 

AGGTTGGCTT GGTGAACCAA CGTTTGAAAA GCTATTACAC CCAATATTTG AAGCAATCAA 6360

TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTACGTA 6420 

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 6480 

GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6540

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

CCAAACTGAT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6660 

TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720 

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780

TGTAGACGAA TTACTAGAAA C7VATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6840

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6900

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 6960 

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7020 

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080 

AATCX5TTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140

TGATSATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200 

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260

TAATACCAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTGTTTC 7320

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAGCGAG 7380

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500 

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560 

CGA 7563 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3492 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 60 

sATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180 

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240 

GCAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 300

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360 

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 420 

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 480

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 540 

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600 

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660 

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720 

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATTT 780 

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 840

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900 

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGCCTA AAATATCCAG CACTGTAAAT 960 

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 1020 

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 1080 

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140 

CGATTAATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 1200 

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 1260 

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 1320 

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 1380 

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 1440

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 1500

TAAATGAATC CATCGAATGA TGTATTGTCT TCAPlATTGCA GTGCCTGTAT CGACTTCAAA 1620

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 1680 

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 1740

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 1800 

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920 

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 1980 

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 2100 

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160 

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220 

TTTCAATTAT GGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2280

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGGCATTA 2340

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 2400

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 2460 

GCGGCCCAGC ATTTTTATAT CATGTATTCG AGCAATATGT TAAAGCTGGT aCsAAACTTG 2520 

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 2580 

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 2640

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700 

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 2760 

AACA5ACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2820 

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2940

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3000

CATTGGCTTC TTTTATCT^T GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 3120 

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180 

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3240 

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCTUVG ACCAATGTTT TTTAACGCTT 3300

CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 3480

TCCACATATG CT 3492
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1973 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 60 

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CACAAACTTA 120 

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180 

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 240

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360

TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480 

GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 540 

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600 

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660 

CTTAAAAGCA TTAC5ATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720 

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780 

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840 

TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 900 

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020

CATTATTAGA TCACGAACT^ TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 1140 

AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200

GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320

TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380

CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500 

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560 

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620 

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680 

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740 

TGTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800 

CACCTCATTT GGGTGCTTCA ACAGTCX3AAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860 

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920 

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7620 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60 

TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120 

AAAT^AACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 180 

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240 

GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 300 

GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360 

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420

AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480

GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 540 

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 600 

TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 660 



GTAAAATGAA AACCCGCTAC AAGTACACAT CTATATGGAG ACTCATTTGA AAGTCAACGC 780

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840

CTATGCAAAT AAAATATTCC ATAACAAAGT ATATACTTTA CATTTTTATA ATTCTTAACA 900

ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960

ATCATATAAT TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 1020

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 1080

CATCGTC5GAA ACGGAAGCTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGGCGG 1140

TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGTGGT GGTACAATTT ATGCACATGT 1200

CATGCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CTGAAGGCGT 1260

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTCCGTA ACAGAGGTAA 1320

GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 1380

GGCTATCGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 1440

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500 

ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 1560 

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 1620 

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 1680

TGATTGTTTT TCTTTGTATC CATCATATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 1740 

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 1800 

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 1860 

CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAGGG 1920

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAGCTT GTATCCTTTA 1980

TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 2040

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100 

GGTTAAGATT AAATTAGATG CTTCATTCAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160 

GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 2220 

ATACTTTATG TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 2280

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 2340

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 2400 

ACTGACAGCG GCTCTTGCAG CTGC7VACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 2460

ATTTAAAGAT TATAGCCAAG TAGCATATAA CGAACATTTA CATAGTAAAT TTAATGCCAC 2580

TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 2640

CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2700

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 2760

AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2940

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3060

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 3120

ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 3180

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3240

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3300

ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 3360

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 3480

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3540

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600

CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660

ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 3720

ATA-CfATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3780

GAAtTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840

TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3900

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3960

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4020

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4080

GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 4140

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4200

TCATTTTTTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 4260

ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4380

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 4440

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 4500

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4560

ATATGTTGAA TGTATAGCGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620 

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACX5TTTTA CTTTAATCAA ATCCGAACX3T 4680 

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4740

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCTAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4920 

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4980

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 5040

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100 

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 5160 

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220 

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAACGTG TATGTCTTAA 5280 

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 5340

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 5400

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460 

AACGCGAAAA TTTAGCATAT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 5520 

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 5580 

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5700 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5760 

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820 

GATTCATATC GAATTACTAG ATTTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 5880 

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 5940

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 6060

ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 6240

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6300

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6360

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 6420

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 6480

ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 6540 

ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660 

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6720 

ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780

CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6840

GaTCATAAAC ACGTTGTATC GCTTGGrLAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6900

TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6960

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020 

TATCAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTT 7080 

CTAAATTTGC TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 7140

CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 7200 

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260 

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 7320 

GTATATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 73 80 

TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440

TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACXy^TTTG ATGAGACATT 7500

TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560 

CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

GTCATtACCG amTTTCtTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 60

AAAAGAAGTG AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120

ACGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180

TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 240

TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 300

CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 360

AGTTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 420

CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAGCAATC ATGGACAAAA 480

TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 540

TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 600

TGTTAAAGCA ATTAAAGAAG ACAAAGTATA TAATTTAGAT CCTAAATTAT GGTACTTTGC 660

AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTGTAAAATA 720

ATTTTAAAAG AGGGGAACAA TGGTTAAAGG TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780

AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 840

TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 900

AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 960

AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCCCTATAA 1020

TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AGTGATATTT ACAGCGATTG 1080

TTAAACCGAG ATTGGCAATT TGGACAACGC TCTACCATCA TATATTCATT GATTGTTAAT 1140

TCGTSTTTGC ATACACCGCA TAAGATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 1200

TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTTATAGC ATGGATAGTA TTTATTACAA 1260

CATTTAAATT TAATAGCAAT AATATCTTCT TCGGTAAAAT AATGGCGACA scgTGTTTCA 1320

GTATCGATTA ATGAACCATA AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 1380

TTGGATGTTC ACCAATAATG CGAACTTCAC GATTTAATTC AATGCCAAAT TTTTCTTTGA 1440

CGGTCTTTTG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500

TTACCATAAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560

AATTAGAATC TTGTATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 1620

CACATGAAGG ATACTCTAAA GGTTGTTTAG ATTCTCTACG TTCTGTTAAA TCATCCATTT 1680

AGTGTTCTTT TTGAATAATG CTATTACC3AT AATCTAACTC TAATTCTTTT GTTGTAAGTT 1800

TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860

CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 1920 

CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980 

CGCCGCTACC GGCTATTATC GCATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 2040

TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100

ATGTAACAGG AATCTCATTT TGaTAGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160

TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 2220 

AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 2280 

CTTTATTTAT CACTTCTCAG TACATCCTTT CTCATGTCTT TAATATCATA TAGTATTATA 2340

CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA 2400 

AAATACGGCA TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTCAAAG TATTGTTGCT 2460 

TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 2520

TGAATTGCAA AGCAATATCA TCATTAGTTG ATAAGAGGTA ATCAAGTGCA AGATAAGATT 2580

CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 2640

AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 2700 

CTACATAATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 2760 

CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 2820 

TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2880

TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2940

AAACH-CCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3000 

ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATGAATCT GTTAAGGAGT GCAATCATGA 3060 

AAAAAATTGT TATTATCXSCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 3120 

ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAATT TAAAGATGTT GAGCAAAAAC 3180 

AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3240

AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 3300 

ACCATCTCAT ACCTAAATTT GAAGCATATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 3360

CAATGAAAGT TAAGAAATTA AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 3420 

CGATATATCA ATTAAAAAAA TTCATAGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 3480 
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AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3600

AAGCGTTAAG AGATACTGCG AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 3660

GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720

CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 3780

GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3840

AAGTCGATGT AGATAAGTTG CCGAAAAAGG GTATAGATAT AACTCACGGC GATAAAGCCT 3900

TTGAAAAAAA GCTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3960

ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4020

AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4080

TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 4140

ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200

ATCAATTTAA TAAATTTTTA TATGAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 4260

AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CGTTAAACAT GAATCACCTC 4320

AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380

TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 4440

CACTACTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 4500

TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTTAAGTTAG TACTTTTTTA 4560

CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4620

CCAATAAGAA AATTTAAACA TGATTTGTAA GTTAGTTTAA TAGGAAATAT ATGCTAAACC 4680

AAAAGAAGCA TATTGTTATT TACTGGAATA ATTAATAATC ATGTCATGTT AAATGTTAGC 4740

ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4800

CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAGC CATGGATGAG 4860

TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4920

CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4980

GATTCTGTAT TTTTATGGcm ATCTGGGcaA AAGTATCGTA TACCAGTACA TGATGCAGAT 5040

ATGAATGAAG TTGAAGGTGT TTACGGACAA ACTGATACAG GGATGACCGT TATCXJAGGGG 5100

AATTTATTTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160

ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGTTG TAATTTCACT AGGTGGGATT 5220

GATAGTTTTG ATGCTGGTGC AGGTATGTTA CAAGCATTAG GTGCTCAATT CTATGATGAC 5280

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GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 5400

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 5460

AATCATAATC AAGCAGCAGA TiATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 5580

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 5640

CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 5700

AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 5760 

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 5820 

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880 

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 5940

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000 

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCGAAAG TATACGTAAA AAATATGAAA 6060 

AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120 

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 6240

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 6300

CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 6420

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 6480

TTTCTAAAAT ACGTTTCTTA CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 6540 

ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6600 

ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 6660 

AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCGTGGC 6720 

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6780

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6900 

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6960

CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

GAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080
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GCGCCAAAAA TCCAACTGTT GGGACGTCGA CATCGATGTT ACTTTCTAAT GTAGCAAATA 7200 

AGTAGCCATT TTCATCTAAA TCAGTTGGCA ATCCTAATTG TTGTAATTCT TTTTCTAATA 7260 

AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320 

GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380

ACCCCTTAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440 

CCATACAGTT GTTTGATACG TGTGTATAGG TAATATAGAA TTTCAGAAAC TAATATACCG 7500

AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560 

CATTTGATAC TAAAAAACGA GTCGCTTGAT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620 

TGCCTGGCAC TATGAATATA ATTACCGGTC GTTTATATCT GCOACTCATA GTATGACTCA 7680 

TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCGCXZAAC TTTTCCAAAC TCTAAATCTA 7740 

CCGTTAATTG GTAAATCGTC CATGCAATGG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860 

GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7920 

GACACCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980 

AAGTAATTCA CCCGCTAATA AATCTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8040 

TGGCATGACA CTGGCTATAG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100 

TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160 

TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCXXSAT CCGCCAGCAA TGACTGCAAT 8220 

CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280

TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGATGCA TGCTGTAAAT GAATAAATTC 8340 

AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 8400 

TAAAGCGATT TTCTCTAAAT CTGTTGTACG CTCTTGTACA CGAATTAATC TTGTACTTGT 8460 

TCGATCGTTT AATGAAAAAA TAATTGCAGT TGAACTGACA AAACTATATG TATTATGAAG 8520 

ACCATAACTA TGTGCXSATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 8580 

TTCaAGTAAA ATTCTACCTG CAATTAATAC AACATCAATC ACTTTGTTTT CATCTATAAT 8640

TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700 

TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760 

CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820 

AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGTTAAACAT 8880 
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TTTACGCTGT GATTTTGGAT CGTCATCTGT TAAATAACCA ACACCGATAG ACACTGACAA 9000

TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACACCCG AACGAATATT 9060 

TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTGTGA ATGACAACTG AGAACTCTTC 9120

GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG TTTTTAAGTA ATTGAGACAT 9180

TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC 9240

ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 9300 

TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 9360

ATCTTCGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC CAAATATCGA CAAATGTTAT 9420 

CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCGATAGTGT 9480 

GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 9540 

TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA ATAGCACTTG TTACTGCAGC 9600

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATACTACAA TTTCAACAAT 9660 

TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA 9720 

TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA 9780

TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 9834 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60 

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120 

GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180 

GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240

AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300 

ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360 

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 420

TTTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480



TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600 

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660 

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 840

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 900

CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020 

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080 

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCGTTA GATACTGAAA 1140 

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200 

TTATTGTTAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260 

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320 

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 1380

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440

TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500 

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620 

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680 

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 1740 

TAACAGCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800 

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860 

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920 

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 1980 

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040

GTGGTATTAT TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100 

TAGGTTTAAC GGGTGTACCT ATGGCTATTG GTGCCATGGC AGCATTTAGT TCGGCATTTA 2160 

TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220

GTATTGAACC TTTATCACAA GCAGATATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280 
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ATGCXJACAGG TACAGCTACA CCGATTGCAG GATTTTTAGT TATGTTTGGA TTTAATCATC 2400

CGACGACAAT TGTGATTTAT GGTGTAGTAA TGGCGATTGT AGGTGCGCTT GCAGGTTATC 2460

TTGGTTCAAT TGTATTTAAA AAATATCCAA TTGTTACTAA GCAAGACATG ATTAATCGAG 2520

GTGCAGTAGA CGCATAGCAT            AATAGTAAAA ACAAATAAAA CATAGTAACG 2580

TGATTCAGTC GATGTAACAG TCGATAATGA GTCACGTTTT TTTATAGAAA AATACAAGAC 2640

ATAAAAATGT CATAATTTAT TGTCGACAAA TATCATACTG TATAAACATT TATCATTTTC 2700

TCAAGTACCT TTTACACGAT GGAATGAACT TACTTTTTAC GAAATTATGC GTATTTTATA 2760

AACAAATATC                       AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820

ACGATATTTT TGTAAATTCA CTGATTCAAG TATTTTAAGT CAATATGAGG AGGGATGTTA 2880

TGAGCGATTC            ATTTTAAAAA GAATTAAAGA TAATCCGTTT ATTTCACAAC 2940

GTGAACTTGC            GGATTATCTA GACCCAGCGT AGCAAACATT ATTTCAGGAT 3000

TAATACAAAA GGAATATGTT ATGGGAAAGG CATATGTTTT AAATGAAGAT TATCCTATTG 3060

TTTGTATTGG CGCAGCGAAT GTAGATCGTA AGTTTTATGT GCATAAAAAT TTAGTTGCAG 3120

AAACATC7\AA TCCTGTAACG TCAACACGCT CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180

AGAACTTAGG TAGGCTTGGC GAAACGGTCG CTTTTTTATC TGCTAGTGGA CAAGATAGTG 3240

AATGGGAAAT                       CATTTATGAA TTTGGATCAT GTTCAACAAT 3300

TTGAAAATGC            TCATATACAG CTTTAATTAG TAAAGAAGGC GACATGACAT 3360

ATGGCTTaGC AGATATGGAA GTGTTTGACT ACATTACGCC TGAATTTTTA ATTAAGCGTT 3420

CACACTTATT                       TTGTAGATTT GAATTTAGGC AAAGAGGCAT 3480

TAAACTTCTT ATGTGCCTAT ACCACGAAAC ATCAAATCAA ATTAGTTATC ACCACGGTTT 3540

CTTCCCCAAA                       CATTACATGC TATTGATTGG ATTATCACGA 3600

ATAAAGATGA                       TAAAAATAGA ATCTACTGAT GATTTAAAAA 3660

TAGCTGCTAA ACGCTGGAAT GATTTAGGTG TTAAAAATGT TATTGTGACA AATGGCGTGA 3720

AAGAACTCAT TTATCGAAGT GGTGAGGAAG AAATCATTAA GTCAGTTATG CCATCAAATA 3780

GTGTGAAAGA TGTTACAGGT GCAGGCGATT CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3840

TAAATGGGAT GTCTACTGAA GATATATTAA TTGCTGGTAT GGTTAACGCA AAGAAAACGA 3900

TAGAAACGAA ATATACAGTT AGGCAAAACC TAGATCAACA GCAACTTTAT CACGATATGG 3960

AGGATTATAA AAATGGCAAA TTTACAAAAG TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4020

GCACGGGAGA ACAATCAACC GATTGTAGCA TTAGAATCAA CAATTATTTC GCATGGTATG 4080

GCCATTCCAG CAACCATAGC CATTATAGAT GGCAAAATTA AAATTGGTTT AGAAAGCGAA 4200

GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260

GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320

GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380

GAACATACGA TGGACATTTC AGCAGACTTA GAAGAACTGT CTAAAACAAA TGTCACTGTT 4440

ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4500

AAAGGCGTTC CAGTTATTGG ATATCAAACG AATGAATTGC CAGCATTCTT CACTCGCGAA 4560

AGCGGTGTTA AGTTAACAAG TTCGGTTGAA ACGCCAGAAC GACTTGCTGA CATTCATTTA 4620

ACAAAACAGC AGTTAAATCT TGAAGGTGGC ATTGTTGTTG CTAATCCAAT TCCATATGAG 4680

CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAGCTGTTGT TGAAGCGGAA 4740

AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800

ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4860

GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4920

CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGCGTTT TTATTCAGTT TTGATCGTAA 4980

AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 5040

TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT TTTTTGAAGG 5100

GCTAATAAAT ATTAGTAAAG CAGGCATAAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 5160

TGGCTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 5220

CATCTTTAAT TATATTAAGG TATTACCATT TATTATCAAA TATGTAGGTA TCGCTATTAA 5280

TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTCAACAG CAATGTTTGG 5340

gca;k:cagaa GTATATTTAA CAATAAAAGA TATTATTCCA AGATTATCTA GAGCGAAATT 5400

ATATACAATT GCGACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460

GCAGATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520

TATCATCGCC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580

CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC CTAAGAAAGT 5640

TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGTAGT 5700

AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA TGTTTGGTAG 5760

TGTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820

GGGGATTCCA TGGAGCGAAC TGTTCCAGCT GGCTCTTTAA TGGCGACTAA ATTAATTACA 5880

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CAAGGTATCA TTTCAGTTTA CTTAGTAAGc TTCGCTAATT TTGGTACGGT TGGTATCATC 6000

GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060

AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGGCTTAGTA 6120

TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 6180 

TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240 

CGTAGCCGTT TTTGAAATGT ATGTTGATGG TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300 

TATATCTTTT TTATGTTTTG AAGGGACAAT GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360 

AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420 

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480 

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTGACATTCC 6600 

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720 

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6840

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC ATTGTACCCA TACCATAAAA 6900

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGTCCAT AACGATTTAT TTGTTTCTTT 6960

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA CTCGCTGTGT ATTGATACAA 7020 

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080 

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140 

ACCGCCTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200 

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260 

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320 

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380

TAATAATGAG CCAATGACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500

TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620 

AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680 
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TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 7800 

TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7860 

AGGTTTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920

CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7980

CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 8040

GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100 

AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 8160 

TTCGAGTAGG AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 8220 

TAGCTCGCTG AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 8280

AATCATATTG TTCTGAGTTG CTTOGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 8340 

AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCGA GTTCTGTGGC 8400

ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 8460

CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520

TAATAAAAAG TTTTCGCCTG AGCTACCATT TACATAAAGA CCGTCTAATT CTTCAGTTTC 8580

AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8640

AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700

AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 8760

ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8820

CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880 

TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 8940

GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 9000

GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 9060

AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 9120

GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 9180 

ACTAGGCGAA TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 9240 

TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 9300

TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 9360 

ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 9420

ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 9480 
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AGGGCTTATA TTAATTC3GGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 9600

CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 9660 

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 9720

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 9780

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA cAGTGTGCCA CGGGCAAATT 9840

AAATTGAACyV AGCHSAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900

ATTAAAAAAG CACCTAGCAA CTCX3TTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TAGATTTAGT ATGTTGATAA 10020 

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 10080 

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC GCGATTGATC CGTTGTTAAA 10140

AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260 

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320

TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440 

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10500

TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10560 

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620 

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 10680

AAGGCATCGC TATAACCATT TTGTCTTATA TAATTGACGA TGCGTTTATC AGTTTTTGTA 10740 

AATAAATGTT GATAACGTTG AACACGATTC TCAAATTTCA TTGTGTCACC CCTTCATCTT 10800 

AATGATTACT ATTATATATG AAAAATATTT TCAAGATAGT AAAAAGCATT GATAAAAATT 10860 

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGGA GTGAGGAATA 10920 

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10980 

TCTTTTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTGT TGGTATTCGC 11040

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 11100

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 11160 

GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 11220

CCGAAAGAAA CGTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 11280 
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TATATTGGCA CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400 

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGCAAAAGT TATTGCGGAA 11460 

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520 

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580 

GAAGATTAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700 

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCGACG GGGATGGTGA TTAATAGCGA 11760 

CAACGCCACG CGTAAAAACC AAATGATGAT GAGTTTCCAG ACAGGTATTT TAATTTCAGT 11820 

TGCTAGTATA CATGGCACTA ATGCTGAGAA AAAGATAATG GCTGATACGC TTACTACACC 11880 

GACGACAAAT TTAGTACTCA TTGCAGCTTT AGTTACTAAC AAAGATGGTA GAAACATCTC 11940 

TACAATAGAA AckCTGACGC TTTTGCTAGT AAAGCCTGAT CAGCAATTGG GAAAATATAA 12000 

ATAAATGGAT AGAAGATATA GCCAAGCCAA TCAATGAATG GTGTATAGTT CGCTACAATC 12060 

AGTCCTAAAA AACCAATCGA TAATATAGAA GGTAAAATAC CAACAGTCAT TTCTAAACCG 12120 

TCTTTCAAAT TGTCCCAAAC GTTCTTCACG AGAGATGGTG TTAATGCATT TTGTTTCATC 12180 

GCCTCTGCAT ATGCAGTTTT CAGTCTGCTT CCTTCAATAG CAACTTCTTG TTCTCCTTCT 12240 

TGTCCGTTAT AATATTCTGT TGATTCATTG CTGATTGGCG GTAGCCATGC AGTAATTGCA 12300

GTCACGACAA ATGTGATGAC TAAAGTTATC CAAAAGTATA AATTCCAATG CGGCATTAAT 12360

CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACACGATT AGTAATCAAT 12480

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGCGGA TTTTCCTGGT 12540 

GTTTTAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600 

TAGCCCACTA ATAAAGAAAG cGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720 

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780

TTTGTGATTG AAGTCATAAA AGTACGTCTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCGAATGGC TAGATGAACA 12900

TGATCGACGA AAATAGTGTT GTTACCATTA ATCGTAAAAG GAATAAAGAA ACATAGTATG 12960 

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020 

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080 
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ATAGTTTGAA TTATTTTCAT ACCAATACAA ATTAACTAAT TATATATAGA TTGAAACTAT 13200

ATTACTTAAT AAAATATTTA TCTTAAATGT TGTTGTGTTG ATTCAACACC ACAACTAAAA 13260

GTGTTTATAA ATTATTTGGA AATACACATA TTTGTAAATG ATTAGTATCG ATTTAATATC 13320

GTATTATTAA ATTTTTATTA ATTTTGTAGT CTTAATCmAA AAATAATATA TGTCATGTTA 13380

TATTGAAGGT GCAGTTGTTT TTCATTCTCA AGAGGGGGTC AAAAAAATAC TTTTGAGGTG 13440

ATTATATGTT AAGAGGACAA GAAGAAAGAA AGTATAGTAT TAGAAAGTAT TCAATAGGCG 13500

TGGTGTCAGT GTTAGCGGCT ACAATGTTTG TTGTGTCATC ACATGAAGCA CAAGCCTCGG 13560

AAAAAACATC AACTAATGCA GCGGCACAAA AAGAAACACT AAATCAACCG GGAGAACAAG 13620 

GGAATGCGAT AACGTCACAT CAAATGCAGT CAGGAAAGCA ATTAGACGAT ATGCATAAAG 13680 

AGAATGGTAA AAGTGGAACA GTGACAGAAG GTAAAGATAC GCTTCAATCA TCGAAGCATC 13740 

AATCAACACA AAATAGTAAA ACAATCAGAA CGCAAAATGA TAATCAAGTA AAGCAAGATT 13800

CTGAACGACA AGGTTCTAAA CAGTCACACC AAAATAATGC GACTAATAAT ACTGAACGTC 13860 

AAAATGATCA GGTTCAAAAT ACCCATCATG CTGAACGTAA TGGATCACAA TCGACAACGT 13920 

CACAATCGAA TGATGTTGAT AAATCACAAC CATCCATTCC GGCACAAAAG GTAATACCCA 13980

ATCATGATAA AGCAGCACCA ACTTCAACTA CACCCCCGTC TAATGATAAA ACTGCACCTA 14040 

AATCAACAAA AGCACAAGAT GCAACCACGG ACAAACATCC AAATCAACAA GATACACATC 14100

AACCTGCGCA TCAAATCATA GATGCAAAGC AAGATGATAC TGTTCGCCAA AGTGAACAGA 14160

AACCACAAGT TGGCGATTTA AGTAAACATA TCGATGGTCA AAATTCCCCA GAGAAACCGA 14220

CAGATAAAAA TACTGATaAT AAACAACTAA TCAAAGATGC GCTTCAAGCG CCTAAAACAC 14280

GTTCGACTAC AAATGCAGCA GCAGATGCTA AAAAGGTTCG ACCACTTAAA GCGAATCAAG 14340 

TACAACCACT TAACAAATAT CCAGTTGTTT TTGTACATGG ATTTTTAGGA TTAGTAGGCG 14400

ATAATGCACC TGCTTTATAT CCAAATTATT GGGGTGGAAA TAAATTTAAA GTTATCGAAG 14460 

AATTGAGAAA GCAAGGCTAT AATGTACATC AAGCAAGTGT AAGTGCATTT GGTAGTAACT 14520 

ATGATCGCGC TGTAGAACTT TATTATTACA TTAAAGGTGG TCGCGTAGAT TATGGCGCAG 14580 

CACATGCAGC TAAATACGGA CATGAGCGCT ATGGTAAGAC TTATAAAGGA ATCATGCCTA 14640 

ATTGGGAACC TGGTAAAAAG GTACATCTTG TAGGGCATAG TATGGGTGGT C7UUVCAATTC 14700 

GTTTAATGGA AGAGTTTTTA AGAAATGGTA ACAAAGAAGA AATTGCCTAT CATAAAGCGC 14760 

ATGGTGGAGA AATATCACCA TTATTCACTG GTGGTCATAA CAATATGGTT GCATCAATCA 14820

CAACATTAGC AACACCACAT AATGGTTCAC AAGCAGCTGA TAAGTTTGGA AATAC7VGAAG 14880
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ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15000 

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15060 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 15120

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300 

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 15360 

TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420 

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 15600 

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGCGTT 15780 

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15840

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15900 

TTCGCrrrCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15960 

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16020

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16080 

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 16140 

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200 

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260 

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16320 

AATCGCATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16380 

ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16560

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16620 

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680 
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CATCATTTTA ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800

CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16860

AGGTTCTGTG ACAAAAGGCG AAGACATGCC GACCATATCT GCATGTTGTA AAGCATCTAA 16920 

AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAATTAAA GGGATACGAC CTGCTAAATG 16980 

TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040

TTGATAAATA TGTCGACCCC AGCTAGCGAT TGCTAAGTAT TGGATGTTTG AAACGTCCAT 17100 

GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 17160 

TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220

TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATTTTTTA ATGAGTCGGC 17280 

ACCGTAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 17340 

TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400

GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460 

AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17520 

TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTTT 17580 

CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17640 

ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700 

TGAATTAGAG CGACGTGCAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17760 

TGATGTGTTT ACGGTCATTG GTGATAATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17820

TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCGGTACATA CTATGATTCC TTTTCTATTC 17880 

AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17940

AATAGAATTG GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000 

CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGCGAAAA GATATAAAAG TTAATTGAAA 18060 

AACGTTATCA TATACGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 18120 

TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 18180 

AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTGAC 18240

ACTTGCAAAA TTAGCAGATC GACTTGGCTT TAAGCGAATT TGGTTTACGG AACATCATAA 18300 

TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATG ATGCATACAT TGGCGCAGAC 18360 

AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCGAC CTTATAAAAT 18420

TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480 
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TAGTTACGAT GAATCGATTT CCHTATTACG TGATTATCTT ACAATAAAGG ATAAACCAAG 18600 

TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660

TAGTAGCGCA ACATCTGCCA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 18720

ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 18780 

AAAACATTTC CAAGCATCAA CGATTAAAAT GGACGCAAAG GTGATGGCAT CTGTATTTGT 18840 

CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 18900

ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960

GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020 

AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080 

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140

ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200 

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260 

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320 

AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGAT3TGCA 19380

AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440 

TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 19620

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680 

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740

ACATGTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800 

AGGfGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860 

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920 

TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980 

T/^GCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 20040 

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 20100 

TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 20160 

ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 20220

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280
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CAATGTCTCT GTTAATGGAT 
GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 20460

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 20520

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760

AACAGTGTAG CTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATTCGTAAA 20820

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 20940

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTATATGA CGATGAATAA AAAGGCATAT 21120

CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 21420

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 21480

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540

tggtSatttt CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600

AGAT" GTGGTA CAAAGCXJGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080
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AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACGCATCAGT TAAAGCAATT 222 00 

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 22260

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22320

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 22380

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440 

TTTAATATTT TAAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 22500

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAQACA 22560 

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620 

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACGT 22680 

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 22740

CGATTAATGT TTGTCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800 

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCGGTAAAAC AGACGATGCA 22860 

GAAGTGATGG CGAaCTGcAA TTGAAGGTAT ACAACGTATT TTAAGAGCTG CAG/^CATGC 22920 

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980 

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 23040 

ATATGAAAAA TCAAAATTGT TAGCTGAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 23100

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 23160 

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 23220

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280

AATGACAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACitiAATTwA 23340 

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 23400 

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439 
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4522 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 60
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TATTATGCAG TCGATTTAGG GAAATCATAT CGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180 

TTGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATGTTAG AAAATTGTCA 240

AATTCATATA CGACTGATGA AGTAAAAGGT AATATTTTTT ATTTTATTAA CTCAAATGAC 300

GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATGT 360

CAAGGCGAAA TGCTCGTAGC AGTGCCACAC CAAGATGTGT TAATTATTGC AGATATACGC 420

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 480

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 540

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 600 

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTGGATA AGGAGTTTTG TCATAATGAA 660 

TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTGA 720 

AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAATGT 780 

TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840 

TAAATTAACT GATGAACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900 

TTATAAATTA AATGCTGATC TATCACCGAA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 960

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 1020 

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 1080

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 1200

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 1260

GGTAGTGAAA ATATGAGCTG GTTTGATAAA TTATTCGGCG AAGATAATGA TTCAAATGAT 1320 

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 1380

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CXiAGGGGAAA ATTCCGTTTT 1440 

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500 

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 1560 

CGACATCGCC GTAGAAGAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620 

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 1680 

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCCAAAA 1740

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 1800

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGaCAC AATGAAACCT 1860 
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AAACAAAAAT ATGATAAATA TGTAGCTAAG ACGCAAACGT CTCAAAATAA ACAATTAGAA 19 8 0 

CAAGAAAAAC AAAATGATAG TGTTGTCAAA CAAGGAACTG CATCTAAATC ATCTGATGAA 2040

AATGTATCAT CAACAACAAA ATCAATGCCT AATTATTCAA AAGTTGATAA TACTATCAAA 2100

ATTGAAAATA TTTATGCTTC ACAAATTGTT GAAGAAATTA GACGTGAACG AGAACGTAAA 2160

GTGCTTCAAA AGCGTCGATT TAAAAAAGCG TTGCAACAAA AGCGTGAAGA ACATAAAAAC 2220

GAAGAGCAAG ATGCAATACA ACGTGCAATT GATGAAATGT ATGCTAAACA AGcGGAACgC 2280

TATGTTGGTG ATAGTTCATT AAATGATGAT AGTGACTTAA CAGATAATAG TACAGATGCT 2340

AGTCAGCTTC ATACAAATGG CATAGAGAAT GAAACTGTAT CAAATGATGA AAATAAACAA 2400

GCGTCAATAC AAAATGAAGA CACTAATGAC ACTCATGTAG ATGAAAGTCC ATACAATTAT 2460

GAGGAAGTTA GTTTGAaTCA AGTATCGACA ACAAAACAAT TGTCAGATGA TGAAGTTACG 2520

GTTTCGAATG TAACX3TCTCA ACATCAATCA GCACTACAAC ATAACGTTGA AGTAAATGAT 2580

AAAGATGAAC TAAAAAATCA ATCCAGATTA ATTGCTGATT CAGAAGAAGA TGGAGCAACG 2640

aATAAAGAAG AATATTCAGk AAGTCAAATC GATGATGCAG AATTTTATGA ATTAAATGAT 2700 

ACAGAAGTAG ATGAGGATAC TACTTCAAAT ATCGAAGATA ATACCAATAG AAACGCGTCT 2760 

GAAATGCATG TAGACGCTCC TAAAACGCAA GAGTACGCAG TAACTGAATC TCAAGTAAAT 2820

AATATCGATA AAACGGTTGA TAATGAAATT GAATTAGCAC CGCGTCATAA AAAAGATGAC 2880

CAAACAAACT TAAGTGTCAA CTCATTGAAA ACGAATGATG TGAATGATAA TCATGTTGTG 2940

GAAGATTCAA GCATGAATGA AATAGAAAAG AATAACGCAG AAATTACAGA AAATGTGCAA 3000

AACGAAGCAG CTGAAAGTGA ACAAAATGTC GAAGAGAAAA CTATTGAAAA CGTAAATCCA 3060

AAGAAACAGA CTGAAAAGGT TTCAACTTTA AGTAAAAGAC CATTTAATGT TGTCATGACG 3120

CCATCTGATA AAAAGCGTAT GATGGATCGT AAAAAGCATT CAAAAGTCAA TGTGCCTGAA 3180 

TTAAAGCCTG TACAAAGTAA GCAAGCTGTG AGTGAAAGAA TGCCTGCGAG TCAAGCCACA 3240

CCATCATCAA GATCTGATTC ACAAGAGTCA AATACAAATG CATATAAAAC AAATAATATG 3300 

ACATCAAACA ATGTTGaGAA CAATCAACTT ATTGGTCATG CAGAAACAGA AAATGATTAT 3360

CAAAATGCAC AACAATATTC AGAGCAGAAA CCTTCTGTTG aTTCAACTCA AACGGAAATA 3420

TTTGAAGAAA GTCAAGATGA TAATCAATTG GAAAATGAGC AAGTTGATCA ATCAACTTCX3 3480 

TCTTCAGTTT CAGAAGTAAG CGACATAACT GAAGAAAGCG AAGAAACAAC ACATCCAAAC 3540

AATACTAGTG GACAACAAGA TAATGATGAT CAACAAAAAG ATTTACAGTC ATCATTTTCA 3600 

AATAAAAATG AAGATACAGC TAATGAAAAT AGACCTCGGA CGAACCAACA AGATGTTGCA 3660 
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CCAAGTGTTT CATTACTAGA AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 37 3 0 

GATAAAAAGA AAGAACTGAA TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 3840

GATGTAACTG AAGGTCCAAG TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900

GTTTCAAGAA TTACGGCATT ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 3960

CGTATAGAAG CGCCTATTCC AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4020

CCAACGACAG TCAACTTACG TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080 

AAATTAACAG TTGCGATGGG GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 4140 

AAAACGCCAC ACGCACTAAT TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 4200 

AGTATTTTGA TGTCTTTACT ATATAAAAAT CATCCTQAGG AATTAAGATT ATTACTTATC 4260

GATCCAAAAA TGGTTGAATT AGCTCCTTAT AATGGTTTGC CACATTTAGT TGCACCGGTA 4320

ATTACAGATG TCAAAGCAGC TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4380

CGTTATAAGT TATTTGCACA TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 4440

CCCATATGAT GAAAGAATGn CAAAAATTGT CATTGTAaTT GATGAGTTGG CTGATTTAAT 4500

GATGATGGTC CGCAAGAAGT TG 4522
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
TCAAGTTTAC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60 
GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 120

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180 

TGACACAATT CGTGCAGTAT AATTTTTACA ACAGCATCTT CTCCATAATG CTCATATTGT 240

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 300 

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360 

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 480 

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 540
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TTTCATTCCT TCTTGTAAAT CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC 66 0 

AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751
(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1076 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60 

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180 

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240 

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360 

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600 

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660 

GACGgAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720 

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTITCCATTA AATGACGTGC TACTTTAACA 840

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900 

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2930 base pairs 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 60

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCTCCAGG 180

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300 

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360 

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420 

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600 

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660 

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 840 

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900 

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960 

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 1020 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080 

CATGGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 1140 

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 1200 

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260 

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 1320 

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380 

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 1440 

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500 

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 1560 
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GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 16 80 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 1740

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 1860

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCGTC ATTGTTACAA AGCTGACCCA 1920

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2040

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100 

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 2160 

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 2220

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2280 

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 2340

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2400 

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 2460 

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 2580 

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640 

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACXJATTTG ATAAAAAGTG 2700 

AGGTAACTAT ATATGGCTAA GAAATCTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880

AGACCTAGAG GTGTATTACX5 TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930 

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3606 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 60
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TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTATT 180 

ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTGGCAA TAACGAAGGA 240

GTATTTATGA ATAAAGACAA GCAATTGCAC AACGACAAAA TCAATCTATC CCAATTAGTC 300

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 360

TCAATAGCTG GACCAGCAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 480

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 540

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT GAGTTCTTGG 600

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 720 

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780 

ACCATCATCA TGTTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 840

ACATTTATGC CTTACXSGAAG TGCACCGATT TTTGCTGCAA CAACAGCATC AGGGATTATT 900 

TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT CAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 1020 

TTACAAAGTA CGTTTATCAC TTCTATGCCT CAATCAATGT TACAACATAG TGGATGGAAT 1080 

GGCATCAACT TCAATTCACC ATTTGCTGAT TTAGCTATCT TATTAGGAAT TAATTGGCTC 1140 

GCAATTTTAC TATACATTGA AGCTTTTGTA TCACC7VTTCG GTACTGGCGT GTCATTTGTC 1200 

GCCGTTACAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 1260 

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGTXtGATTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 1380 

GCAACTTTAG TAGCCTATTT AACTGGCCCA ACGACAGTGA TTGCATTAAG AAAAATGGGA 1440

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 1500

GTATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCTGA AGTTATTTTA 1560

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 1620

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 1740

TTTATCGTTA TTATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 1800

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 1860
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CACACACATT AACCAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 1980 

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 2040 

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCGCTTGT AACAAGCTTT 2100 

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 2160 

GCAAGTTGGG GTGGGACGAC GATAAAGAAA TACTTTTTCT ATAGAAATTA GTATytCTTA 2220

TGCATGAGTT TTACTCATGT ATTCATATTT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 2280

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATCAAG CTCGTCTGAT 2400

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 2460

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2520

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2580 

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCCAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2700 

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2760

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2820

TGGTCATGGT ACAGCCGAAC TCTATCAACC GGCACATTTA GTACAAAATA ACGACATTAT 2880 

CGTTATTACA TGCAATTATC GTTTAGGCGC ATTAGGATAT TTAGACTGGT CATATTTTAA 2940 

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 3000 

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 3060

AGGCAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 3120

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 3180 

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3240 

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTCTAAAGG 3300

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 3360 

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 3420 

GAAAAAATTA TCGCCGCAAC GCTTTATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 3480

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3540

ACAGCCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTCCAACC GCACCAACTA 3600

TGGCTT 3606 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15109 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60 

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120 

TGGTTTGAAA ATGCAACaAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240 

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300 

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420 

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600 

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720 

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840 

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900

GTCAGTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960 

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020 

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140 

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200 

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380 

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440 
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GTACCACCAT TAACACACTT TATAACAAAT TCTACTTACT TTTTATGGGA TGTTTTAACG 1560 

ACTGCTTATA TTGGTAACAA GGACTTGGTT CATTCAATTG AGAAAAAAGT CGATGTAATA 1620 

AGTTATGGAC CAAGTCAAGG TAAGAC7VTTT GAGTGTAAAG ATGGGCGCAA AATTAATGTC 1680

ATAAATCATG TAGATAACAA CGCATTTTTT GATTATATAA CTGCACTTGC TAAAAAAGTA 1740 

AATTAACAGC TGTGTAGAAT AATTAAGGTT TTAATTTATA TAGAACAACT TATTGTAAAC 1800 

TTTTCATTTC TTAAAGTTTA CAATGGTGCT ATAATAATGG TCATGAAATA CGAAAGGAAG 1860 

TAAAAAATGA CAACAAAACA GTTAGTATAT ACAGCTTTAA TGACAGCGAT TATCGCTATT 1920 

TTAGGATTGG TACCGGTAAT TCCACTACCA TTTTCTTCAG TACCAATTGT ACTTCAAAAC 1980

ATTGGTATTT TCTTAGCAGG TGCGATTTTA GGACGTAAAT ATGGCACATT AAGTGTTATC 2040

GTCTTTTTAT TATTAGTAGT TGCTGGCTTG CCATTGTTAT CAGGTGGTCG CGGTGGCATC 2100 

GGTGTATTCG CAGGTCCTTC AGCAGGGTTT TTACTATTAT ATCCAGTTGT AGCATTCATG 2160 

ATTGGGGCGA TTCGAGATAG ATTCATCAAT GAAATTAATT TCTGGATTTT ATTCGTTGGT 2220

ATTTTAGTTT TTGGTGTTAT AGCATTAGAT GTTATTGGTA CATTGATTAT GGGCATGATT 2280 

ATTAACATAC CATTTACGAA AGCTATTTCA ATTTCATTAG CTTATTTGCC TGGTGATATA 2340 

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT ACAGCTTTAC TTAATCACTC GCAGTTTCGT 2400 

CAAATTATGG GAATAAAATA ATCATATTTA AGATAGTAAA GTAATTGAAT AAGTTGCTTT 2460

GAAATTTATA AAAGTGAAAG GAGTAGGTGT CAATGGCTAG TATAAGTATG TCAGATATAT 2520 

ATTGTAACGG CACTATATTT GAAAATGACG ACGAGCAGTT GATTTATTTA ACGCCTTCTT 2580 

TTCCACAACG ATACACAAGT AACACATGGA TATATAAAAA GACGCCTACC CAAGAGCGAT 2640 

GGCTGAAAGA CTTAGAACGT CAACATCAAT TACATACAAA TCAAGGTTCA AATCATTATG 2700 

CGTTTAGTTT CCCGGAAAAT GAACAACTTG ATAATCATTG GATGGCTATG TTTAAAGATA 2760 

TGAATTTTGA ACTAGGTATT ATGGAATTGT ATGCCATAGA AAGTGATGCG CTTGCCAATT 2820 

TGCCGCGTAA CTCTGACGTT GAAATTGCCA TCGTTGACGA GTCGCATATA GATGCCTATT 2880 

TAAAAGTTGC ATATCAGTTT AGTTTGCCAT TTGGAAAAGA CTATGCAGAT GCACATGAAG 2940 

AAATGGTAAG GGAACATTAT CAAAAAGATG TGATTAAACG CTTAGTAGCT TATTTAAATA 3000

ATGAACCTAT TGGCGTTGTA GATGTCATTG AAAGTGAAAA TTACATTGAA TTAGATGGAT 3060

TTGGTGTATT AGAACAATTT CGGCACCAAG GAATTGGATC TACAATTCAA TCGTTGATAG 3120 

GTGAATACGC CATATCAAAA AATCACAAAC CAATCATATT AGTTGCAGAT GGTGAAGATA 3180 

CAGCAAAAGA TATGTATGCA AAGCAAGGTT ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240 
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TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTGA GCTACTTATA 3 3 60 

CTTATAAAAA TAGTGCGTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 3420

CTATAATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 3480 

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3540

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 3600 

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 3660

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3720 

CGATGTACAA AAAATGTCAA ACCATCGCTT ATAGCAGCAT TATGTAAGCC ACCTGTTTCT 3780

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840

AAAGTGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3900

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960 

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4020

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4080

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200 

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4320

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4380

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 4440 

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4500

TTAAftAATGG GGGTTCACTC AATGAAATTG AAACGTTTAT TTGCTGTTGT GATTGCAATG 4560

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4620 

GCAGACAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 4680

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4740

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAG GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4920

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4980 

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5040 
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CAAGGTTTTG TGTATAAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA 5160 

GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATCAGAT 5220 

AGTAAATTAG CAAAAGAGTG GATGGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA 5280

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5340

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 5400

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520

AAGAGGACCA ATCGGTCAAT TCTTTGCGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5580

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 5640

GCAAGGCTTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 5700 

TGAAACGAAA ATTTTCCTCA AATTAATTTT ACCATTAGCT AAACGCTCTA TTTTAGCAGG 5760 

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTTGGT GCTACATTAA TGGTTGCAGG 5820 

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880 

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6000 

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6060 

AAAAATTTAT GCAGTTCGTG GTCCATCTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAGTG AATGGGCAAT TACTTACTGA 6180 

TACGGCAAAA AACGTGAATG TTAAAATTCA ACAACGACGT ATTGGATATC TGTTTCAAGA 6240 

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 6300

TGAACACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6360 

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAC TTAGCACrAA 6420 

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 6480 

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 6540 

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 6660 

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 6720 

CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6780

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA ATAAATCAAA 6840
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GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6960 

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020 

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140 

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCTTGGA 7200

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260 

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320 

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 7380 

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG GAAGGTAGTC 7440

ATTATTCATT TGGTTTCAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500 

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATGC AACATTGTGT GGTAGAGACA 7560

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGACATTCT TGTTCAATTT TTAAAACAAC 7620 

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT TGAATTTAAA GGACACCGCA 7680 

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATGGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7800

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7860

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7920

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCCA 7980 

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 8040 

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100 

AGAIAGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160 

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 8220 

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 8460

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520 

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640 
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AATGCTTGAA TGAGCGACAG CAGTTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760 

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880

GAAACCCAAT CCCAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000 

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060 

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120 

GTGCAGGTTC AGTTTCTGAT AAATTAGTTG GGGATCACGA AGCGGTGCGT ATTATGACTG 9180 

GAGCACAAAT ACCTAATGGC GCAGATGCTG TTGTTATGTT TGAACAAACG ATTGAACTAG 9240

AAGATACATT TACAATTCGT AAACCATTTT CAAAAAATGA AAATATATCT TTAAAAGGTG 9300 

AAGAAACAAA GACAGGCGAT GTTGTTCTAA AAAAAGGACA AGTAATTAAT CCAGGGGCTA 9360

TCGCGGTCCr TGCAACATAT GGCTATGCAG AGGTTAAAGT TATTAAGCAA CCGAGTGTCG 9420 

CTGTTATTGC AACAGGAAGC GAATTATTAG ATGTTAATGA TGTATTAGAA GATGGGAAAA 9480

TTCGTAACTC TAATGGCCCA ATGATTCGTG CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 9540

GTATTTACAA AACACAAAAA GATGATTTAG ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 9600 

TGGAAAAACA TGATATCGTT ATTACAACGG GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660 

TACCTGAGAT TTATAAGGCT GTAAAGGCGG AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720

CTGGTAGCGT AACAACGGTT GCATTTGTAG ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 9780

ATCCATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900 

AGGO^CCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960 

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020 

GTATGGTCAT GTTACCAGGA GGGTCACGTG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080

TATTGACTGA ATCTGACGCT GCTGAAGAGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200

TGGTTATACA GTTGCTACTA TTAAACATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260 

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320 

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAATTAT 10380

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440
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GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10740

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA OATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160

ATTTGAAGAT TTATTGTTTG AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220

TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 11400

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640

TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700

TGGfiCGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12000

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240
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ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT CGTAATATTA 12360

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480 

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600

TTGAAATCGA TCCTGTAGAA CCAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660 

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720 

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGG ATGTTTATTT GCAACTGTCG 12780 

ATGGATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960 

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020 

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200 

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260 

TAATGTGCAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440 

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500

AGATATAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560 

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740 

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920 

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980 

GCGTTACAAA CTAAAAACTT sAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040
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GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 14160 

GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCTACGTAAA AAGTAGATGT 14280

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14340

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 14400 

GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460 

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14520

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14580

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14640

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14700

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14760 

TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCGT ATTTTGTAAT 14820

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 14880

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14940

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 15000 

AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA 15060 

TAATCGTTTA GGTCCrATTT sATTTACAAA TTTACCTGTA GCAAATCGA 15109
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60 

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120 

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180 

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 360
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AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 480

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840

TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACT^ CCAATTGAGC GATAATAAAT 900

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020

ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080

AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200

TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320

ATTGGAACCA GTGGAAGTGG CTAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380

GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440

TTGCGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560

CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620

GCAOAACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 1680

CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 1800

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 1860

GTGGTGCAAT TTGATACX3CC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040

ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAA6TTT ACGAGACACC 2160
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ATTTTAAAAA GAAACGTTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 2280

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340

GATACAGTGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 2400

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460 

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520 

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640 

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700 

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760 

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820 

TTGAATTGCC GTTAGCATTG CCXSCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940 

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000 

CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3060 

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120 

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 3180 

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240 

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360 

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480

ACGtTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540 

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600 

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3660 

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3720

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780

TATGATTTGA AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900 

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCGGA TGGTAAAGGT 3960
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AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4080

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 4320

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4500

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGCGATT CCGATTGCTA 4620

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAACGATTA GACCCAACAA 4680

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4800

GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4920

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100 

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160 

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280 

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TT6CTATTAT 5400 

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460 

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 5520 

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580 

TGCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640 

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATTCTAGGT TTAATTGTTG TGTTTATCTG 5760
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AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 5880

AGCAGGTGGT GCTVCTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060 

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 6240

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 6300

TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 6360

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 6420

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGCGAATGAT TTTCAAATTG ACGCTAATAT 6600

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6720 

ATATGATGTC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780

ATTAAATATA ACX3ATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960

GATGAATGGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020

CAGASGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080 

TAGTGGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCX3AAGTGT TTGATTTATA AATTAACGCA 7320

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560

EP 0 786 519 A2 



CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 7680 

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740

TTCAGATTTG TCATAGGcTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7920

TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040

AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100 

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160 

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220

GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CGAAGGATTA 8280

GCAAATGAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340 

GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400 

GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 8460 

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520 

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580

GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700

GCGCAAGGTG GTACTCTGAT TTATCAAACT TGGAAAGATG GCATGAATCC AATTAAATAA 8760

TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940

TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060

ATACAGCAAT CC 9072
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16826 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60 

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120 

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180 

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 240

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300 

TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACATT ACAAGCGAAT TATATGACGC 360 

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AATATTATAG 480

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCQACGTTA CCAACTCAAT 600

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720 

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780 

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTC 840

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900 

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960 

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080 

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 1140

AAGACAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200 

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260 

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 1320 

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 1380 

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440 

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500 

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560 

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620 

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680 
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AAGCGATTAT TCGATGTAGT GAGTTCAATA TATGGTTTAG TAGTTTTAAG TCCGATTCTG 1800 

TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860

AGACCGACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920

ACACCTAATG TTGCAACTGA TTTAATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980 

GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040

ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100 

ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160 

AGAGATGATA TCACTGATGA TCAAAAAGTA GCGTATGATC ATTATTACTT AACACATCAA 2220 

TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATCGT TACTTCAGAA 2280 

GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CGTACATGGA TATATCGGTA 2340

ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400

ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460

TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGC 2520

TGACGAAACA ATTGGCACAA AAGGCTAAAG CTGAAGACGT TAAACAATTT ATTTTTATGA 2580

GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640

AAACACCAAT GAACCCTACG ACCAACTATG GTATTTCCAR AAAGTTCGCT GAACAAGCAT 2700

TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760

GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820

TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880

ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940

ATAC&TCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAACG GTATTGATCA 3000 

ACATGCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTC GGTCTTTAGA AAATTATTCG 3060 

GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTCCTG 3120 

GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180 

AAGTCATCTA TTAAATAAAA TCAACATACA AATCGTTTTA TTTGGAGGTT ATAGTATGAA 3240

GTTAACAGTA GTTGGCTTAG GTTATATTGG TTTACCAACA TCAATTATGT TTGCAAAACA 3300 

TGGcGTCGAT GTGCTTGGTG TTGATATTAA TCAGCAAACG ATTGATAAGT TACAAAGTGG 3360

TCAAATTAGT ATTGAAGAAC CTGGATTACA AGAGGTTTAT GAAGAGGTAC TGTCATCGGG 3420

AAAATTGAAG GTATCTACAA CGCCAGATGC ATCTGATGTT TTTATCATTG CCGTTCCGAC 3480
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TAGTATTTTA TCATTTTTAG AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 3600

TAAAACGATG GATGATTTTG TAAAACCAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3660

AGATATTTAT TTAGTGCATT GTCCAGAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 3720

AGTTCATAAC AATCGTATCA TTGGCGGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 3780

TGTCTATCGC ACATTCGTTC AGGGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 3840

GAGTAAGCTA ATGGAAAACA CATATAGAGA CGTGAACATT GCTTTAGCTA ATGAATTAAC 3900

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATGG CAAACAAACA 3960

TCCGCGTGTT AACATCCATC AGCCTGGTCC AGGTGTAGGC GGTCATTGTT TAGCTGTTGA 4020

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTCAAA CTGGACGTGA 4080

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAGCAAATCA TCAAAGTGTT 4140

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA GGTGATGTTG ATGATATAAG 4200

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA GAACCAGACA TAGAAGTATG 4260

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 4320

AGACGCATCG CTAGTATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 4380

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 4440

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4500

TGTGTCAAAC TAGGGCATAC ATGATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4560

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTGTGA 4680 

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 4740 

ATCASkSATTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCGAATGCAC 4800

TTGCTAAACT TGATAGCATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4860 

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4920

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4980 

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 5040 

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 5100 

ACGCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 5160 

GCAAGAAAGT TGTTTTACTA ACAGCGCATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 5220

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 5280
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GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 5400

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700 

TACCTTTACG TCACAAATAA TAAAAAACXTC CTAATCATGA AGTTGGTTTA GACAACCAGC 5760 

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820 

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880 

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5940

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000 

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060 

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240

TGAAAAACCT ATTACAATGT ATAAGCTAAT TJUWiTTTTA ATTTTCTGTT GTAGCGTGTA 6300

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 6420

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 6480

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 6660 

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720 

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780 

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACGCTTACTA 6840 

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900

ATAGCCAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960 

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 7020 

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA CAAGAGCAAA AGATAAAGAT 7080
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TCAGAACGTG CACAAATGTT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 7200

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320

AC7VGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 7440

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7500 

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560 

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620 

TTTACOGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680 

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740

GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800 

GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7860 

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920 

CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7980 

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100 

ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 8160 

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220 

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400 

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520 

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580 

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 8640

GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 8700 

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760 

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820 

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880 
GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060 

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180 

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATC CATTGGCATT ACTGTCAGAT 9240 

TCATTTCATA TGCTTAGTGA TCJTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360 

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420 

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480 

GGTTTACTCG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540 

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATOGGAG ACTTATTGAA CTCTATTGGT 9600 

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660 

AGTATTGTAA TTTCACTCAT CATTTTACX3T GGTGGTTATA AAATTACGCG TAATGCgTGG 9720 

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840 

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900 

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020 

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080 

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140 

CGACATCTTT AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT AAAATGCGTT 10200 

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260 

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380 

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440 

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500 

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560 

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620 

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680 



TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920 

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980 

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040 

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100 

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160 

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220 

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280 

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340 

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460 

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520 

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820 

aCTAAAACAT AAARCGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880 

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940 

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000 

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060 

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120 

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180 

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240 

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300 

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360 

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420 

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480 
GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 12600

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720 

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780 

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840 

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900 

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960 

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020 

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080 

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200 

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260 

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGCTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380 

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440 

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500 

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560 

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620 

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680 

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTXRTTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800

ATASGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920 

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980 

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100 

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160 

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220 

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280

TTTAGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460

TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520

ATTTTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580

AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACTTAAC 14640 

TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATC GGCTATTTAT TGTTAGATGT 14700 

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760 

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820

TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000 

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060 

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120 

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240 

CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300 

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420 

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480 

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540

CAACATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600

AAAtCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660 

AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720 

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780 

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840

CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900 

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960 

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020 

GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080 
TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260 

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320 

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380 

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500 

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560 

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620 

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680 

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740 

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800 

TGAnAAATnT CATTCATGTG GnAATC 16826 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4012 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60 

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120 

TAT»yVACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180 

AGCTTAGCTA tnCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA itiGGCGGAGTC GCTTTTGCTA 540

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600 

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660 
ATCAGGATGA TGATGAAATT AAGCCACCAT TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020 

CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200 

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260 

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320 

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380 

TTAATGTGGT TGCTTGAGGA AAAATTTATT CATTGAAGTC AAGTTGGTTC ATTTTAGAAA 1440 

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500 

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620 

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA CAAACTGTAT AATTAAAGGT 1680 

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACGCTATGT 1920

TATATTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980 

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160 

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220 

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280 

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACGCGATG GCTTCTTAAC GCGTGATATT 2340 

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400 

GTAAAGTTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460 
CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 2580

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700 

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760 

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820 

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880 

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000 

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060 

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120 

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180 

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240 

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300 

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420 

TTAAAAAAGA AATCX3ATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480 

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600 

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660 

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720

CAACSaGAAC AACGCCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780 

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840 

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACX3 3900 

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960 

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012 

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7778 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180 

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 240

CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 300

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 360 

TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 420

GTCACAAGCG TTACGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 480

CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGG 540

TGGACGTGCA TTAAAATTCT ATAGTTCAGT AAGACTAGAA GTACGTCGTG CAGAACAGCT 600 

TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 660 

GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 780

TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 840 

AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 900 

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 960

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020 

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080

AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 1140

TTTAGAAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200 

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260 

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320 

AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500 

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560 

AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620 

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680 
CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1800

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040 

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100 

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160 

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220 

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280 

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340 

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACGAGCTGGA 2400

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460 

GGTGTAGAAT TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520 

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580 

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700 

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760 

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA AtCTAGTTAG ACAGCACTTT 2940

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120 

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180 

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TGCAATTGTT 3240

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300 

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360 

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420 

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480 
GGATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3600

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660 

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960 

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020 

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200 

TTTATCACTA GTTTGCCACA AAGACATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4260 

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320 

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380 

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500 

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560 

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4620 

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040 

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100 

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160 

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220 

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280 
TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCCAAC GAGGTGGACC 5400

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460 

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520 

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580 

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700 

CAAGCGTTAT GCGTtAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760 

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880 

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5940 

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000 

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060 

TAATAAAGCG AAGAAAGTCXS TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180 

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240 

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420 

TATTAATTCT TATGGCGTTC ATTCTATTCA CX3GACGTGCA TTACCTTTAG CTCAAGGTGT 6480 

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCX3GGA GGAGATGGTG ATGGTTATGC 6540 

TATASGTATG GGGCATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660 

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720 

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780

AACAAAACTA ATTGAAGATG cAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840 

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900 

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960 

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020 

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080 
TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7560 

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680 

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740 

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60 

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120 

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180 

CGATSTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240 

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300 

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360 

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420 

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540 

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600 

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660 

GGATATTGCT TCAGCAACTc ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720 
AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA 840

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900 

AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960 

CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020 

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080 

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128 
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60 

GATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120 

GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180

AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 240

TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300

ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360

TTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 420 

ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480 

ATTGTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540 

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600 

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660 

TGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720 

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780 

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840 

CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900 

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960 

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020 
TCTATCAATA ATGCATCATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAACTGC 1140

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAAT AATTTCGTTC 1200

ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGCGAGA GGTACATACT 1260

TTAAGCATTA TCGCACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 1320

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380

CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 1440

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 1500

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 1620

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 1680

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 1740

CGATCTTGTA ACAOGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 1920

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160

TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACX3AT CAGCAACTAT ACCAGTACGT 2220

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 2400

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTTTTAATT 2580

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640

ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820
TGTTGTTTTA AATCAGCATT AAGCATGGTT GTAATGCCTC CTTAGATTTT ACCTACTAAA 2940

TCTAAACCAG GTTGCAATGT TTTAGCGCCT TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3000

GGGTTTTTAC GAACATATTG AGCTGCTTTG ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060

CCAATTCCGT CAGCGTTAAT TTCAGATGCT TGTACAACAC CGTCTGGGTC GATAATGAAT 3120

GTACCACX5TT GAGCTAAACC AGTAGCTTCA TCTAATACAT CAAAATTACG AGTGATTGTT 3180

TGTGATGGGT CACCAATCAT AGTGTAAGTG ATTTTGCTAA TTGCATCTGA ATGGTCATGC 3240

CATGCnTGT GTACGAAGTG AGTATCAGTT GATACTGAGA ATACATTTAC GCCTAATTTT 3300

TGTAATTCTT CATATTGGTT TTGTAAGTCT TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360

TCAGCAGGAT AGAAGCATAC TACGCTCCAA GAACCTTTTA AATCTTCTTG TGTAACTTCT 3420

TTAAATTGAT CTTTTTTTGG ATCGAAArCT TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 3480

ATTAATGACA TAAATATCTT CCTCCTAAGA ATTTAAGTAT GAATTAGAAC TATCAATTGA 3540

TTGCGCTTAA TTATAATAAT TCTAATCTCT TAGTTAGCAT TATTACATTT TGATCCAGAA 3600

TAGTCAACTG GATAACTTTG TAAAGTGAAT GATTACTTTT AAAATAAAGA AAGATAATAT 3660

AAAGTGCTTT GATAATGGAT TTTGTAGTTG ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720

ATCTTGATTT TAATGTAAAA AATGTAAAAA AAGAAGATTT GTATTCTCAA CTAAGTCAAC 3780

CTTATTGATA ATGGTATGAG AATATTTGTT CGAGATGGAT GAAGGTAATG AGTGAGAAAC 3840

TGGATTTTTA AAGTATGAGA CAATATTTTA AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3900

AATTGCTATA AAAAAGTTTG GACGTGTACA ATTGCAATAT GAAGATTTTA AATTAATTGT 3960

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA ACATGTATAT AATCTTGTGA AAAAGCATCA 4020

TTCTGTTAGA AAATTTAAGA ATAAACCTTT AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080

AGCTGGACAA AGCGCTTCGA CGTCAAGTTT CCTGCAAGCA TACrCAATTA TTGGTATCGA 4140

CGAlXSAGAAG ATTAAAGAAA ATTTACGAGA AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200

TGGCTATTTA TTCGTCTTTG TTATTGATTA TTATCGTCAT CATTTAGTTG ATCAACATGC 4260

TGAAACTGAT ATGGAAAATG CATATGGTTC AACGGAAGGT TTGCTAGTAG GTGCAATCGA 4320

TGCAGCATTA GTTGCCGAAA ATATTGCGGT AACTGCTGAA GATATGGGGT ATGGCATTGT 4380


CTTTTTAGGA 


TCATTAAGAA 


ATGATGTTGA 


ACGCGTTCGA 


GAAATTTTAG ACTTACCTGA 


4440 


CTATGTCTTC 


CCGGTATTTG 


GTATGGCAGT 


AGGGGAACCC 


GCAGATGACG AAAATGGTGC 


4500 


AGCCAAGCCA CGCTTACCAT TTGACCATGT CTTCCATCAT AATAAGTATC ATGCTGATAA 


4560 


GGAAACACAG 


TATGCACAAA 


TGGCAGATTA 


CGACCAGACA 


ATCAGCGAGT ACTATGATCA 


4620 
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CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4 74 0 

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4800 

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCTGCA TTATCATTTG AATCGAAAGT 4860 

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920 

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980 

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 5040 

ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100 

TTGTAAATCA ATTTCTACAT TAGCGACGGG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160 

AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 5220 

ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 52 80 

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 53 4 0 

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 54 00 

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460 

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACX3TAAGA CGAACGTCAC 5520 

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580 

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 5640 

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 5700 

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760 

CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 582 0 

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880 

ATGASACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 594 0 

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000 

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6060 

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120 

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180 

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6240 

GACTAGTATG CT 6252 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6730 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60 

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120 

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 18 0 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 24 0 

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 300 

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 4 20 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 4 80 

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 540 

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600 

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660 

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 72 0 

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGtAGCTA CGCAgTATTT ACATTATTTA 780 

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCA ACCTGTGTGG 84 0 

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900 


GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020 

GTACTTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080 

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140 

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200 

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260 

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 132 0 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 1380 

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440 

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500 

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560 

TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 1680 

AGTTAAAACG AAATAAGTTA GCTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 1740 


TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800 

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 1860 

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920 

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT CAAATTTCAT 1980 

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTGCGA 204 0 

TTTCTGGATT CTTCGGTGGA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100 

CATCTATTCC GAATTTAATT GTCGTAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 216 0 

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 222 0 


GAGAATTTTT AAAATTAAAA AATCAAGAGT TTGTCATGGC TTCGAAAACA TTGGGGGCTT 22 80 

CAAAATTCAA ATTGATATTT AAGCATATTT TACCTAATAC ATTAGGTGCT ATCGTGGTTA 2340 

CATCAATGTT TACAGTACCT AGTGCTATTT TCTTCGAAGC ATTTTTAAGT TTCATTGGTA 2400 


TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 2460 

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2520 

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580 

AAGGGGGCAT AGCATATGAC TGAAAGAATA TTAGAAGTAA ATGATTTGCA TGTTTCCTTT 264 0 

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 27 00 

GAAACATTGG CAATTGTTGG TGAATCAGGT TCAGGTAAAT CTGTAACAAC AAAAGCAATT 2760 

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGGG 2 820 

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2 880 


ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2 940 

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3000 

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCCTCAT 3060 

CAATTTTCAG GTGGACAAAG GCAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120 

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180 

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240 

GATTTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3 300 

GTTGAAACAG GAGATGTTAA CGAAATATTT TATGATCCAA AGCATCCATA TACATGGGGA 3360 

GGAGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 3480 

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 354 0 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600 

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660 

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 372 0 

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3 840 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3 9 00 

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3 960 

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4020 

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4 080 

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 4140 

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGOSGACG 4200 

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260 

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4 320 

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 43 80 

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 444 0 

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4 500 

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560 

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620 

ATGCAATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 4 680 

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4 800 

TGAcTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4860 

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4 920 

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4 9 80 

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040 

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100 

TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160 

ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC 5280 

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340 

TAGATAAAGT GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 5400 

CTGAATCAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 5460 

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520 

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCGCA CAAGCAATAG ATAAAAAAGG 5580 

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 5640 

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 57 00 

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760 

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820 

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 5880 

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGOTTGGTC 594 0 

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000 

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060 

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120 

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180 

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 62 4 0 

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 6 3 00 

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6 360 

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420 

TAAXCTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 64 80 

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540 

TGTTGAGTGC TTGCGGAAAA AGCAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6600 

CGGAAACCTC AAAACATAAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAAGTG 6660 

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720 

ACGAAAGCTT 6730 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6482 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60 

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120 

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyC 180 

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240 

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 3 00 

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360 

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420 

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 480 

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 540 


CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 600 

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 660 

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 72 0 

CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780 

GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840 

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900 

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 9 60 

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 102 0 

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080 

CTGGAACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140 

TATGtGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200 


TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260 

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320 

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380 

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 1440 

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500 

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560 

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620 

aAGTCTGACG GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGcCACCA GTCCCAACAG 1740 

ATAATACGGT TGTATTATCX; TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800 

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860 

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 192 0 

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980 

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040 

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTITTCTA ACATATGTTT 2100 

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT OGAGCGTGTT CAACAACTGc 2160 

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220 

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 22 80 

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 234 0 

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 24 00 

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460 

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 252 0 

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580 

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640 

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700 

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 276 0 

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2 820 

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880 

TTTGCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2 94 0 

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000 

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3060 

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 312 0 

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 3180 

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3240 

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300 

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360 

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3420 

TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 3540 

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 36 0 0 

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660 

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720 

GCaGCACGTC AAGGTTATGG CGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780 


GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3840 

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900 

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960 

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4 020 

ACGAAGCGTC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 40 8 0 

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140 

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 420 0 

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 426 0 

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4 320 

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380 

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440 


TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500 

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4560 

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620 

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680 

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 474 0 

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800 

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 486 0 

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 492 0 

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980 

ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040 

ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100 


TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160 

CCAATCAAGC CCCCGTATAA CGTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220 

CCAGGTGATA 


ATGATTTCTG 


CTTATGAATC 


TGAGCATCAT 


TATTAGCGGC 


AGTAAAATCA 


AGATGACTTG 


TTGTGAAATA 


GTAGACCGCA 


ATCATAATGA 


CAATCGCAAT 


TAAAAATGGG 


GTAACACCGC 


CAAGCACAGC 


AATTAAACGA 


TCGAATTTTA GAAACAGTGT 


TGCTAAAATA 


AAGGCGACTA 


ATATGAGTGC 


GCTCAGCCAA 


TACGGTAAGT 


TGAAACTTTG 


ATGAATGGTT 


GACGCACCAC 


CTGCAGTCAT 


AATAATAGCT 


AAAGACAACA 


TAAACATTGT 


TAAAATAATA 


TCAAAACCTC 


TTGCAATAGA 


GGGGTATAAG 


AAATAGTTAA 


TTGAATCAGA 


ATGATTTCTG 


GACTTTAGAT 


GATGACCTGT 


ATGCATGACA 


ACCATTCCAC 


CTAAAGTAAT 


CAATAGTCCT 


GTTACAATAA 


TGCCTGAAAT 


GCTATATGCG 


CCATGACTTG 


TGAAAAACTG 


GAAAATTTCT 


TGACCAGTAG 


CAAAGCCGGC 


ACCAACGACA 


ACACCAACAA 


AGGCAAATGC 


CACAATAATG 


GACTCTTTTA 


AGATACGCAT 


GATTTAAAAA 


TGTCCCTTCG 


TAATTTTAAG 


TAATATAGAA 


AATGTAACAT 


ACATGTTAAT 


GAAAAATATA 


GTACTAATAT 


AGTATTTTGT 


TAAATTGGAG 


TAGAAGCGAG 


GGTGTCGGTC 


ATTTCATTAA 


TTTATTAGTT 


GATTTTGCAT 


TTTTTTGCTG 


TAAAGTTGTT 


ATAATACAGT 


TAACAGGAAT 


TAGCATAGAT 


ACACCAATCC 


CCTCACTACT 


CGCAATAGTG 


AGGGGATTTT 


TTTCGGTGTA 


GCTAGGTCGC 


CTATTTATCA 


TCGTGTTTGC 


GTAgCaATGC 


GTAAACACAG 


TACCACTAAA 


TAAGTGCACG 


ATACATGCAT 


CAAATGTCGT 


CTTTAGTcTA 


AGTAACGATC 


ATGCATTAAC 


ATTTTCAAAA TATCTATTTG 


AGCTTGAAGA 


TCTTTACCAA 


TATTGGTATC 


ACGAATCTTC 


TTACGTTGTA 


ATTCTTTATC 


TACGACGCGC 


TTTATAGAAA 


GTTCATCGAT 


ACCTTCGGAA 


AGTATTTTTn 


CTTTAGCGTT 


AAATTGTTGG 


TGTGCAACGA 


GTTGCATACC 


GAATGAATTA 


TACAATAGTG 


TATAGCCTGC 


AATGCCAGTn 


GTTGACTGAT 


AAGCTTTTGA 


AAAGCCACCA 


TCAATGACAA 


GCATCTTTCC 


ATCAGCCTTG 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16592 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 
AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 



5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6482 
AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 240 

TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGC 3 00 


ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360 

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420 

ATCACAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 4 80 


CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540 

GACATkwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600 

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660 

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720 

TTAAAAGGTA ATCGACTATT CTATTTAGCA ATGGCACCAC AATTCTTTGG CGTTATTTCT 7 80 

GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840 

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900 

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960 


ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020 

TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080 

GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCACA TGTTACAAAT GGTTGcATTA 1140 


TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 12 00 

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260 

CAATATGGOG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320 

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380 

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 144 0 

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500 

GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 156 0 

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 162 0 

ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680 

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740 

GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800 


GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860 

GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920 

TATATTATGA AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 204 0 

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100 

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160 

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 222 0 

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280 


TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2 340 

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400 

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460 

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2 52 0 

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580 

CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640 

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2 700 

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760 

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2 820 

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880 

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940 


CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000 

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060 

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120 

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180 

GTTACJTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 324 0 

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300 

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3 360 

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420 

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480 

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3 540 

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3 600 


ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 36 60 

ATGACTTGTT GCTGATTCAT ACCAATACTT GTCATCGTCA CCCCACGTCG ATACACGTCG 3720 

CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3840 

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3 900 

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3 960 

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4 02 0 

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080 

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 414 0 

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 42 0 0 

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATCAT ATTCCCATGT 4260 

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320 

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4 380 

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440 

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4500 

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4 560 

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4 620 

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4680 

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4 74 0 

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4 800 

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860 

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920 

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4 980 

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 504 0 

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 51 00 

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160 

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 522 0 

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280 

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340 

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400 

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460 

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520 




AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 564 0 

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700 

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 576 0 

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 582 0 

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880 

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 5940 

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000 

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060 

TTTTCTCTTT AAACATTAAA GATGGTGTTC CTAGGTTCAC TTCCGGGCTA TGCTTTTCAA 6120 

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180 

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 624 0 

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6 3 00 

CCATAAAATA GTACCTCTAT TTCTCTATAG TACATGCTAT CATAACACAG TAAATATTTT 6360 

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420 

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480 

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540 

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600 

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG TAACTATCAT AGTAATTAAT 6660 

ACTTGATGAG AAACCAGGTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CAACATCACG 6720 

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780 

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840 

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6900 

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6 960 

TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 7020 

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080 

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140 

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200 

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260 

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA GACATGCCTA ATAATTCTTT 7320 

CATTTTCACA TAGTGTCCAG CACCATTAGG TCCAATATAA GTAACACATG AAGCACCGTC 7440 

TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 750 0 

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7 56 0 

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 7 620 

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 76 80 

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 774 0 

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800 

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860 

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACCAATAA CTCCAATTTG 7920 

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 7980 

CATACACTAC ACTAAATCAT TTCX3AATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8040 

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100 

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160 

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 822 0 

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 82 80 

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA GTCATATGTT 8340 

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 84 00 

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460 

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 8520 

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580 

ACATTTGCGT CGGTGCACCT ACAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 8640 

AACCAAAGTC CGCGTCCAAC AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700 

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760 

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8820 

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880 

CAACAGTATC CATATGGCTC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940

TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 9000

CTTTAACATC TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 9060

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120

GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240

TCATATTATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 9300

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360

CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA 9420

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480 

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 9540 

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACTATATC TCAACTACCT ATACAAAATC 9600 

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660 

ATTACTTTCA AACATGGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720 

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 9780 

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 9840 

TTCAAACGCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900 

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960 

ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10020

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080 

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140 

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 10200

TTACATTTGT ATTCGTCtAG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260 

GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 10320 

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 10380

TTTAATACGA CAGTTAATGG TATAAATAAC AATACGATAA TACCGAGTAC AATTGGACTC 10440 

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500 

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560 

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620 

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680 

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740 

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800 

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860 

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920 






AGAATTGATC ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGTCAA CATCTTCTGC 11040 

AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100 

GTTCATGTAT AAATCGAAA7 TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160 

ACTACCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220 

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280 

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340 

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400 

TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520 

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580 

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760 

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGCTAAAG TATTAATTTC TCTAGCTATA 11820 

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880 

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940 

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000

AACGTTAGAT TATATCCTTC TTTATTTTTA AAGCTGTTTT TATAATGATT TCTCGTATTC 12060

ACAAGATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180

CTATTOTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240

TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 12360

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420 

GTAGATGTCT GTTCCACTGT TGCACTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480 

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540 

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600 

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660 

TCATATTCAT CAATATGATC ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720

AATTCACGCA TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12840

GGAGAAAATG GCATAGATGG TACAtCTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACCTTCTAAA 12960

TTATCTTCAG TTACAAGTAA AACTTTACCT GTATGTTTAG CACGATCAAT AATTGTTTCT 13020

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13080

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 13140

TCTTCACCTT CACGTTTCAC ATCTGCTTTT CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13200

ACTTCTTCCT TTAAGAAACG ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACG GTGTGGAAGG AATAACAATT 13320 

GTTAAACCTG GCGATGAAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCCTCCG 13380 

TGAACACCGC CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13500

GCAAATTGAA TTTCTGCAAT TGGTCTTTTA CCTACCATAG CTGCACCAAT GGCAGTTCCA 13560 

ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCACCATA TTTTTGTTGC 13620 

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13680

ACATCTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGCCTCT AAATAAGATA 13740 

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC GTACACAAAT GCATAGGCTT CTTCGACACT 13800

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13860

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13920

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC GATATTGGTC 13980

GTCATCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14040

TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160 

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220 

TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280 

TGAGCTACCT TCACCAACAG TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14340 

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 14400 

ATTCTTAGCT CTACTACTAA AGTGTGATGG CATrTGTTTT CCACCAGAGT TAACATCGTC 14460

TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAACGAA 14520

AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GGCTATCAAT 15000

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180

AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT CCTATACAAT CACCAGCTGC 15240

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360

TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420

GTTAACATTT ATATCATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 15540

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600

TCCGATAAC7V CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840

GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16020

CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16080

TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 16140

TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200

TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260

CATGATTGTC TATTTAGTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320

TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 16440

AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16500 

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560

TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60

TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300 

AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360 

TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420 

AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480 

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540 

CTTTTGTTTT TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 600 

ACAOTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720 

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900 

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960 

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020 

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080 

TTCATAAACG TCAATCAACT GTTTCACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 1140

TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCOS TTGAATACTG 1260

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 1320

ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 1380

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTTAT CTAAGATAGG 1440

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560 

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 1620 

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 1680 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740 

AATAAOSCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800 

TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCTTTTTT AGTTGGTACC 1860

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980 

GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2040

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160 

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280 

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340 

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 2400 

AAAXCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 2460 

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520 

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580 

GTTCCTAAAC CAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGC?rcCT 2640 

TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700

GTATATTCTT TCX3TATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TTAGACTCAG TAGTAACCTG ACCACCACCT 2820

GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880 

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940

TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 3240

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3300

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CATCTTCAAA TTTGCTGACA 3360 

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420 

CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3540

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600

ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780 

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3840

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3900 

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3960

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4020

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4080

TGCAACTGCT GTAACGTCAt TGaCCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140 

GGTTGAGCTA TGTCAACTTG AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260

GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320 

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440 

TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500 

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4560

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620 

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680 

ATTATTTGAA ACTCACATAT ATATTGCATA CAAAGCTCTT GAACACCTTG ATATAACAGG 4740

TACTAAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4860

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4920

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4980

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 5040

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160 

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATnTTGTT TAATTATGCT 5220 

TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AOATTTCTTA 5340 

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 5400 

GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580 

GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700 

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 5760 

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820 

GGCACACTGT CGAAGTCGAT ATCAATGATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 5880 

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5940

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000 

GTGT^CCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060 

TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240

tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300 

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTITAG ACTCAGTAGT AACCTGACCA 6360 

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420 

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480

ACGTGACCTG ctTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540

TCGTAATCAA TGTCAAGAGT TGATGAATCA TATTCCTCTT CAACAGTAGT TACTAAATTC 6660

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6720

TTTTGAATAA TCGGACCATT TTTCTCATTT CCGTTCGCTT TATTACTGTA TAAAACTAAA 6780 

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840 

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CATTTAAATA CTCTCCATCA 6900

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTC7V ATGAATAGCT TCCATTATTT 6960 

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT TAAATTTAGA AGTATCTGTC 7020 

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 7080 

ACTTTTCGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 7140

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 7200

TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 7260

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 7320

TTTCCATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 7380

ACATCAACCT TATCTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 7440

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACTTT TCTAGCAGTT 7500 

GATACGCCAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560 

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 7620

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 7680 

ACATCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT TAGCTTCCGC TACATCTGCT 7740

GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC TTCAACTTGT 7800 

GTTTCTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 7860 

CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7920 

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 8040

TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100 

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 8160 

GCTGCAGCTT CTTTGTCTTG TCCCATCCCA ACAACGATCA TTGTTCCTAA GAATACTGAT 8220 

GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 8280 

CCCTTTAAAT GCAAAATTCA TTAATTTTTT AAACTTAATA AATGCAAGTC TATATTGTTC 8340

ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 8460

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820

TTTCTTTTAC AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATGCAACA TTTACATCTG 8940

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCGAAC 9000

CTAATGCGAT ACGTAGCAC7V GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 9300

ATAGATTTGT AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 9660

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720

TAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9960

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 10140

GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320 

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440 

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500 

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560 

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620 

GCAGCS3CGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740

TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100 

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160 

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340

TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400

TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460 

TCTGCCCAAG TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520

TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640

ACATCAACGT TTCGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760 

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820 

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880

CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940

AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120 

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180 

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240 

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300 

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360 

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420 

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480

CTTCATTACA CGACATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540 

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CGATTTAAAT TTAAAGGAAA 12600 

TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12660

CTGTACTCAT ATGCGCTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12720 

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTTGAT AACTGCCCAC AAAGATGGTG 12780 

AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12840

CACCT^TGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAAOGGCCAT TTATTTTTCA 12900 

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12960 

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13020 

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080 

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200 

ACGCfiCAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260 

TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320

AATATTGAAT CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTCTCCA GTCGAATAGT 13380 

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440 

ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTTTTA TGATTTCGTA 13500 

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560 

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620 

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 13680 

TAAACTTGCC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740 

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1059 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60 

TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180 

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240 

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360 

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420 

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600 

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660 

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720 

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780 

GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840 

GAAT5TTTTT GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900 

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960 

TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020 

nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059 
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30246 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60

ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120

TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180

ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240

GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300

TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360

GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420

CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480

TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540

AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600

TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660

ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720

GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780

AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840

GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900

GCGATTAATT GATAGACTCA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 960

CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020

AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080

TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTCG 1140

CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200

TAACSTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260

TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320

GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380

ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 1440

GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500

CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACGCCA AAAATAATCA 1560

ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620

CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACT^G 1680

CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740

CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 1860

TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920 

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100 

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160 

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220 

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280 

ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340

CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400 

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCX3G 2460 

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520 

CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580 

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880 

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940

CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000 

TGTiSATAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060 

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120 

GAGATAcTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180 

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240 

CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300 

ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420 

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540

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TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3660

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3720 

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3840

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3900 

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3960 

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020 

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4080 

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140 

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 4200 

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260 

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4320

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 4380

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4500

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT 4560

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4620

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4680 

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740 

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT 4800 

AAAOTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860

GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4920 

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4980 

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 5040

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT 5100 

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 5160 

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 5220 

AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280

CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 5340 
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CAGGAACTTG TAGAAGGTAA GAA3GATCCT ATCGGTGCAA TGGGATATGA TGCGCCAATT 5460 

GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520 

GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 5580 

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700 

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760 

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820 

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880 

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000 

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TAC7VAGGCAC CGTTGTCGAT 6060 

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180 

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 6360 

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420 

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAACGCTT TAATACAGGG 6540 

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600 

CAA-ETAGCTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660 

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720 

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780 

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840 

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900 

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6960 

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 7020

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080 

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTCGTTTAG CAGAAACACA TCAAACATTA 7140 
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AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260 

TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320

GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7380

AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 7440

CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 7500 

AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7560 

ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 7620 

GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 7680 

AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740 

GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800 

GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860 

GGTAAAGGAT TATCTGGTGG TAOGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920 

GAAATTATTG CTGGTAACX3T CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 7980

GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040

ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100 

GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 8160 

GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220

GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 8280

AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 8340

CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400

GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460 

AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520

TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580 

AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 8640

ATTGTGGAAC GCa3TTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 8700 

CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTO 8760 

CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820 

CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 8880 

TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940 
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CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480

AAGCT^TTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540

ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600

AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 9780

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900

AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200

AACOTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCS3GGGG TTCGAATCCC TCTTCCTCCG 10320

CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740
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TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860 

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920 

AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100 

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160 

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220 

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280 

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GnTGATTTT AATGCATCCG CAATTAGTAT 11400

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460 

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520 

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCX3AT AAAAATGTTG GTCGTTGGAC 11580 

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 11700 

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760 

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820 

TTGTTGCGAT TTCCAATATT TGTCASGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880 

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940 

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000

CTGCGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120 

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180 

ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240

TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 12300 

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360 

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420 

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480 

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540 
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CGACGGAGCA TGaATGGTTT AAAGAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660

ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720 

GTAATGCATG GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780

GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840

ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900

AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960

TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020

TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080 

AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TGATTATGTT GATGGTGAAA 13140

AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200 

GAGGTATTTA TGACGGTGGC GGATCGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260 

GGGTAGTGTC TAGATTTGGT GATGATACGT CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320 

TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380

TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440 

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500

GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560 

GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620

GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680

GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740 

ATTTATTTGT TTATGAACGT CATTATAAGA ATCAACAATG GCTAGTAATT GCGAATTTCT 13800

CAGCaTCGGC TGTTGATTTG CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860

CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920 

CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCX3AAACA AAAAAAGTTT ATGAAGATTT 13980 

ATGAGGCGTT GAAAGAAGAT ATATTAAACG GGCAGATTCA ATATGGTGAA CAAATTCCGT 14040 

CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100 

ATTTGTTGGC ATTAGACXK3C ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160 

ATCAGGAGGT TACAGAGTTT CGATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220 

AAATGGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280 

TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340
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TTGTTTCAGA TATAGGTAAT GATGTTGCGA GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460

TATTAAATCT TAATATTAGT TATTCAAGTA AGTCTATTAC TTTTGAACCG TTTGATGAAC 14520

AAGCATATCA ATTGTTTGGT GATGTATCGG TGGCTTATTC AGCAACAGTT CGAAGTATTG 14580

TGTATTTAGA AAATACAATG CCGTTTCAAT ATAATATTTC AAAACATCTT GCAAATGAAT 14640

TTAAATTTAA TGACTTCTCA AGACGTCGTA TAAAGTAAAC AATGATATAA ATGATTTATA 14700

CTTGCAATTA ACTATTAAAA TATAGTAATA TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14760

CGGTTCCCTG TACTCGAAAT CCGCTTTATG CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14820

TTTTGCGAAG TCTGCCCAAA GCACGTAGTG TTTGAAGATT TCGGTCCTAT GCAATATGAA 14880

CCCATGAACC ATGTCAGGTC CTGACGGAAG CAGCATTAAG TGGATCATCA TATGTGCCGT 14940

AGGgTAGCCG AGATTTAGCT AACGACTTTG GTTACGTTCG TGAATTACGT TCGATGCTTA 15000 

GGTGCACGGT TTTTTATTTT TTAAATATTA AACCGATTAT TAAGAGTTGA AAATATATAA 15060 

TTATAGAAGC TACTTTCTTG AAGACAATTC AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120 

AAGTAGCTTT TTTATATGTG AAGTTTGATT CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180 

TTTTGTGTCA ATGAAAAGTA AGAAGTTATA ATTTGATGAT AAAGAAATGA TGGTGAAATG 15240

AGGGGGAGTA TCTTACAATA GAATTATTAA TGAGATACGT TATGATTATT GACAATCAAA 15300 

TGCCTACGGA GGACATATGC AAATATATTT AAGTACTTTA ACAGAGTTAG ATTATGATAA 15360 

ATCTTTAAAT AGTATTGAAG AAAGTTTTGA TGATAATCCT GAAACGAGTT GGCAAGCACX; 15420 

TGCGAAAGTA AAACATTTAA GAAAATCTCC TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480

GAAAAATGAA AATAACGATG TCGTTGGACA CGTTTTATTA ATTGAAGTAG AAATTAATAG 15540 

TGATGATAAG ACGTATTATG GTTTGGCGAT TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600 

TGGACAAAAA TTAGGTCGTG GCTTGGTTCA AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660 

GTATAGTACG GTTGTTGTAG ACCATTGTTT TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720 

TGCTGCTGAG CATGACATTA AATTAGAATC TGGTGATGCA CCGTTACTTG TAAAATATTT 15780 

ATGGGATAAT TTGACGGATG CACCACACGG AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840 

ATTGTTCAAT TAAGAAGTAA AGGTATTATC ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900 

GGTGCTAACT TGAATTATCA AGCCTTATAT CGTATGTACA GACCCCAAAG TTTCGAGGAT 15960 

GTCGTCGGAC AAGAACATGT CACGAAGACA TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020 

TCGCATGCTT ATATTTTTAG TGGTCCGAGA GGTACX3GGGA AAACGAGTAT TGCCAAAGTG 16080 

TTTGcTAAAG CAATCAACTG TCTAAATAGC ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140 
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AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 16260

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 16440

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG CAGATGCACA ACAAATTGAA 16500

TGTGAAGATG AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16560 

TTAAGTATTA TGGATCAGGC TATTGC7VTTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620 

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680 

CAAGGTGACG TACAAGCATC TTTTAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740 

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 16800

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16860

ATGATTGATC TTATTAATGA TACATTAGTG TCGATTCGTT TTAGTGTGAA TCAAAACGTT 16920

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGTGATT 16980

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTTGCAA 17040

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100 

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160 

TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CAAATTGTTG 17220

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280

AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340

GAGGAAGAGA TCCATTGTGA AATCGTCAAT AAAGACGACG AGAAACGTAG TAGTATAGAA 17400

AGTGTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460 

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17520 

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580 

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640 

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700 

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760

CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820 

TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880 

TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940 
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TCCCTGGaAT GTGATCATAG ATGCATTATC CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060

TTATGAAATT GCCAGGCATT GGTCCAAAGA CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120 

ATATGAAAGA AGACGATGTT GTTCAGTTTG CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180

TAACATATTG TAGCGTATGT GGTCACATTA CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240

ATAAGCAAAG AGATCGTTCA GTTATTTGTG TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300 

TGGAAAAAAT GAGAGAATAC AAAGGTTTAT ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360

TGGATGGCAT TGGACCAGAA GATATTAATA TTCCTTCATT GATTGAACGC TTGAAAAACG 18420

ATGAAGTTAG CGAATTAATC TTAGCTATGA ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480 

TGTATATTTC TAGATTAGTT AAGCCTATAG GTATCAAAGT GACGAGATTA GCACAAGGGT 18540 

TATCGGTAGG TGGCGATTTA GAGTATGCTG ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600 

GTAGAACAGA AATGTAATkT CTTCTATTAA ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660

AAGTCACAGT GTAATCATTG TGGCTTTTTT TATGGTGTGG TGTGATGTAC TACTTTATTT 18720 

GCGGTGTGGC GGTGGTATGG TTTACCTAGT TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780 

CAAGCCGTTG GTTGTGATTT GTTACTTCTA ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840

TAGATCTATG GTTATGGTGT GTTGGTGCTA TTAATTTGAT AAATGCGGTT AATGACTATG 18900 

CAAATGAAAT TCTTTTGTAA TTGAAATGAT AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960

GGTCTAAAGC TTATTAAATC AGCCTGTATA GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020

TAAATTTATT TTTAATTTCT GGTAAAAAAA TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080 

ATATGGTTAG AGAAAAATCT GTTTCTTGTT CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140

TTTTTAAGTT CGATTTTTAG GATAAGGGCG TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200

ACTGTTGTTA AGCAGTTTGA AAGCCTGTAT AGTATTTATT TGTTGAGGCA AACAAAACAA 19260 

CTCAACTTAA GAAATAACTT GAATTACTAA CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320

AAATGTTAAT AAAATGTATA ATTAATTCTT GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380

AATGACAATA TGTCAACGTT AATTCCAAAA AACGTAACTA TAAGTTACAA ACATTATTTA 19440

GTATTTATGA GCTAATCAAA CATCATAATT TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500 

GAACGCTGGC GGCGTGCCTA ATACATGCAA GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560

CTGATGTTAG CGGCGGACGG GTGAGTAACA CGTGGATAAC CTACCTATAA GACTGGGATA 19620 

ACTTCGGGAA ACCGkAGCTA ATACCGGATA ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680 

AGACGGTCTT GCTGTCACTT ATAGATGGAT CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740 
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GAGACACGGT CCAGACTCCT ACGGGAGGCA GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860

gCtGaCGGAG CAACGCCGCG TGAGTGATGA AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19920

GGGAAGAACA TATGTGTAAG TAACTGTGCA CATCTTGACG GTACCTAATC AGAAAGCCAC 19980

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040

TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100

GTGGAGGGTC ATTGGAAACT GGAAAACTTG AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160

GTAGCGGTGA AATGCGCAGA GATATGGAGG AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220

TGTAACTGAC GCTGATGTGC GAAAgCGTGG GGATCAAACA GGATTAGATA CCCTGGTAGT 20280

CCACGCCGTA AACGATGAGT GCTAAGTGTT AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340

AACGCATTAA GCACTCCGCC TGGGGAGTAC GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400

GGGGACCCGC ACAAGOGGTG GAGCATGTGG TTTAATTCGA AGCAACGCGA AGAACCTTAC 20460

CAAATCTTGA CATCCTTTGA CAACTCTAGA GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520

GACAGGTGGT GCATGGTTGT CGTCAGCTCG TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 20580

CGAGCGCAAC CCTTAAGCTT AGTTGCCATC ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640

GTGACAAACC GGAGGAAGGT GGGGATGACG TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700

TACACACGTG CTACAATGGA CAATACAAAG GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 20760

CATAAAGTTG TTCTCAGTTC GGATTGTAGT CTGCAACTCG ACTACATGAA GCTGGAATCG 20820

CTAGTAATCG TAGATCAGCA TGCTACGGTG AATACGTTCC CGGGTCTTGT ACACACCGCC 20880

CGTCACACCA CGAGAGTTTG TAACACCCGA AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940

CGTCGAAGGT GGGACAAATG ATTGGGGTGA AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000

GCGQCTGGAT CACCTCCTTT CTAAGGATAT ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060

ATAACGTGAC ATATTGTATT CAGTTTTGAA TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120

TAAAGTGATA TTGCTTATGA AAATAAAGCA GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180

TACATTGAAA ACTAGATAAG TAAGTAAAAT ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240

AAAGAGTTTT AAATAAGCTT GAATTCATAA GAAATAATCG CTAGTGTTCG AAAGAACACT 21300

CACAAGATTA ATAACGCGTT TAAATCTTTT TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360

TGACTTATAA AAATGGTGGA AACATAGATT AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420

GGCACTAGAA GCCGATGAAG GACGTTACTA ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480

AGCTTTGATC CAGAGATTTC CGAATGGGGA AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540
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GAGGAAGAGA AAGAAAATTC GATTCCCTTA GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG ACACTCTATA CGGAGTTACA AAGGACGACA 21720

TTAGACGAAT CATCTGGAAA GATGAATCAA AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780

TGTCTCTCTT GAGTGGATCC TGAGTACGAC GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840

AGGACCATCT CCTAAGGCTA AATACTCTCT AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900

GAAAGGTGAA AAGCACCCCG GAAGGGGAGT GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960

GTAGTCAGAG CCCGTTAATG GGTGATGGCG TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020

GATTTGATGC AAGGTTAAGC AGTAAATGTG GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC AGGTGATGTA CCCTTGGTCA GGTTGAAGTT 22140

CAGGTAACAC TGAATGGAGG ACCGAACCGA CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200

GGGTAGCGGA GAAATTCCAA TCGAACCTGG AGATAGCTGG TTCTCTCCGA AATAGCTTTA 22260

GGGCTAGCCT CAAGTGATGA TTATTGGAGG TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320

CGGGTTACCG AATTCAGACA AACTCCGAAT GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380

TGGGTGATAA GGTCCGTGTT CGAAAGGGAA ACAGCCCAGA CCACCAGCTA AGGTCCCAAA 22440

ATATATGTTA AGTGGAAAAG GATGTGGCGT TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500

GCAGCCATCA TTTAAAGAGT GCGTAATAGC TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560

GTACCGGGGC TAAACATATT ACCGAAGCTG TGGATTGTCC TTTGGaCAAT GGtAGGAGAG 22620

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA GGACATGTGG AGCGCTTAGA AGTGAGAATG 22680

CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740

AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800

TGGATAACAG GTTGATATTC CTGTACCACC TATAATCGTT TTAATCGATG GGGGGACGCA 22860

tAGGATAGGC GAAgcGTGcG ATTGGATTGC ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920

AAATCCGGTA CTCGTTAAGG CTGAGCTGTG ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980

TTGATTTCAC ACTGCCGAGA AAAGCCTCTA GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040

GACACAGGTA GTCAAGATGA GAATTCTAAG GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100

GGCARAATGA CCCCGTAACT TCGGGAGAAG GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160

GCCGCAGTGA ATAGGCCCAA GCGACTGTTT ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220

AGGTGATGTA TagGGcTGAC GCCTGCCCGG TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340
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TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 23520

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640

TCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700 

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23880 

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000 

CGTCGTGAGA CAGTTCGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060 

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120 

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180 

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240 

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300 

AATCAAAATA AATGTnTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 24360 

ATAAATTACA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAACDTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24720

TAGCGAsGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780 

AATGTACACT TTCGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840 

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900 

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24960 

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020 

CTTTATAATT AATGATTTTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080 

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140 
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TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620

TTGAGCCCAT TGTATTAA1T GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100

ATAAGCTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 26220

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 26280

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

TT'TOTAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940
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ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180

GGAATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 27360

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480 

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GCGAGTCTAC 27540

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600 

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720

CAGCTTTGAA ACAGAATTAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840 

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900 

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 28020 

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140

TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200 

ATACTAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260 

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380 

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500 

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG CCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

TTTCTTCAGT TTCTTCAACT AATAATTTGT CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740 
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TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 288 60 

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100 

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160 

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220 

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280 

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340 

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400 

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460 

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520 

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580 

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700 

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760 

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820 

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940 

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120 

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180 

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240 

TCAAGA 30246

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14333 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60 

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180

CTCAAAATCT TTAGAGCCGC TTCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240

TAATCGTGTA TTCACTTCCA CGGGTAATTC ACTTACGCCT AACAAAGCAA TACTGCCTTC 300

TGGTGAAATA TGTTCAACTA TTTGTTGAAG TGCAACTTGA CTTCCTTTAC CTCCAACACA 360 

TTCAAATGCA TGATCAATTT TAAGATCATC TGGTATTTGA TTTACTGTAA AGATGTCATC 420 

TACAAATGAA AAATGACTTA ATTTATAGTC TGTCTTACCA AATACATAAG TTTTAGCTTC 480

TGGGTACAAC TTACGTAGCA AAATAGCAGT AATATAACCT AAGTTACCAT CACCCCAAAT 540

ACCAAAGCTG GTTTTCAAAG GTATAGATTT ACGTTCAAAT CGTTGTATAG CATGATAACT 600

TACTGACACT AACTCTGTGT ATGAAATCGT ACTCAAATCA ATGTCATTAG GCAGCGGAAC 660 

GATACGATCA TGTGCCATCA CAACGTAGTC TTGCATAAAA CCATCATAAC CACTAGATCT 720

AAAATAACTA GAGGCTAAGT AATTCTCCGC AATAATATGA TGTTGCTCTG TAGGTGTATT 780

CGGTACCATT ACTACTTTCG TACCTTTTTC AAATACCCCT TTACTATCAA ATACAACTTC 840

ACCAACAGCT TCATGAACTA ATGACATTGG TAATTTTTTG CGTAGTACAT TTTCATCTCT 900 

TCGACCTGTG TAATACCTTT GATCAGCTGC ACAAATAGAC AAGTATAAAG GTCTTACGAT 960 

GACATGATTA CCATAAATAT CAACATTATT ATATGTGACG TCGAACTGTC TCGGTGCAAC 1020

GAGTTGATAT ACTTGATTAA TCATCGGCAA TATCACCTTG AATAATGGCA TTTGCTACTT 1080 

TTAAATCATA CGGTGTTGTC ACTTTAATGT TGTATAGTTC TCCaCGTACC AATTTAACTG 1140

CATgrCCAGA TTCGACAATG ATTTTACATG CATCTGATAA GATTTCTTTT TGTTCACTAC 1200 

TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320 

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380 

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560 

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGOGTTGC GATAATAATC TCATTAAATT 1620 

CACTCACTAA AATGAACTTC TCAATTGTAT GGATTAAAAT CGGTTTATTA TCAATATCTA 1680 
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CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG 1800 

ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860 

TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920 

AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2040

ATTTATCTTA TTAAGTCGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100 

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160 

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 2340

TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460

ACAGTTCCTU^ GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520

ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 2580

TAGCGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640 

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700 

TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760 

GATATGTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820 

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940 

AACATTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCT GGGCAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120 

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180 

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 3240

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360 

TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420 

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480
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TGCTTTAATA CCTTCGCCGG ATTTTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3 6 00 

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780 

ACCTTTTTTG CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900 

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT GCTTTATCTA ATCTGACTTT 3960

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020 

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080 

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140 

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200

CCATCAACTT GTGTTCXIAAT ACCTTTTCGA TGGTTGCATA AATTATAGTC TTTTGATTTA 4260 

CAGTATTCAC ACTCATTACA AACATAGAAT GTCGTTTCAG aTGtGACACG GTCACCAACT 4320

TTAAAATCTT TAACGTCTGC TCCAACTTCA ACGATTTCAC CAGAAAATTC ATGACCTAAT 4380

GTCACTGGAA AATTAACTTT ATAATGACCT TCATAAGTAT GAATATCTGT GCCACAAATT 4440 

CCTGCATAAT GTACTTTAAT CTTTACTTTA TCATCTAGCG GTGTTGCAAC TTCTTTATCA 4500

AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680 

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740 

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCX3A 4800

CAACAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACG TTCGCCAAAA CGTTTATTTA 4980

GCCATGTTCT TGCAGACTCT GAAACTGGCA TTAAACCTTC CATTAAGATT TTTACCATTC 5040

TAGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAATGATG TCTCCAGGTT 5100 

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160 

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220 

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280 

CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 5400

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460 

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520 

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640 

AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700 

AACCAGACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760 

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820 

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 5880

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5940

TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120 

CGTGCGTTGA TAACTGGGAA TTTATATTCT TTTTTTGTCA TTGCAGTTGT AACTAATAAA 6180

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240 

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300 

CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420 

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480 

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540 

TGTTECCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600 

GTTAATATGT TCX3ACATCTG TATGCX3GTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6660 

AGTAGCAAAT TCTTTTTCTC TGTCGATGAC TGCATCTTTA AACGTTGACT TCACGAACCC 6720 

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 6780

ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840 

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020 

TATATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080 
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TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200 

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260 

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGcGACaA TCGATCCATT AATGGTTGAA 7320

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCX3 7440 

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCTGTTGCTCA TTTTCAGATA 7500 

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560 

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGGAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7680

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7800

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860 

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 7920

TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980

TTTGTTCAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040 

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100 

TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160 

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280 

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATff&AAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTTTGAT 8400

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACT^TT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520 

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700 

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760 

CTTGTTCAAG CCACAAATTG ATTTTTTGAA TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820 

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
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TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 9000

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060

AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 9120

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180

CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240

TTATTCGTAT GTACGTAACT TATGGTCTAT C7VAGTTCCAC ACTTCTTCAA CATCAACTGC 9300

TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 9360

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420

TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540

AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600

TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660

AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720

CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780

TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA OGTGATGGTG TCGACGGTAG 9900

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020

GTTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080

TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 10140

TCCTCGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380

GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680
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AAATTTACCG GTCATCTTTG CAATTGGTGT CGCAATCGGA TTATCTAGAA GCGATAAAGG 10 BOO 

TACTGCAGGT tTAGctGCGC TGCTCGGTTT CTTAATTATG AACGCAACTA TGAATGGCTT 10860

ATTAACTATC ACGGGCACAT TGGCAAAAGA TCAGCTTGCA CAAAATGGAC AAGGCATGGT 10920

GCTCGGTATA CAAACGGTTG AAACCGGTGT TTTTGGCGGG ATTATCACAG GTATTATGAC 10980

CGCAATACTT CACAACAAAT ATCACAAAGT GGTATTACCA CCGTATTTAG GTTTCTTTGG 11040 

TGGCTCTAGA TTTGTCCCTA TTGTCACAGC ATTTGCCGCA ATCTTTTTAG GTGTATTGAT 11100 

GTTTTTCATT TGGCCAAGCA TACAAGCCGG CATTTATCAT GTTGGTGGAT TTGTAACGAA 11160 

AACAGGTGCC ATCGGTACTT TTGTTTATGG CTTCATCTTA AGATTGTTAG GTCCACTCGG 11220 

TTTACACCAT ATTTTTTACT TACCGTTTTG GCAGACGGCA CTTGGTGGTA CTTTAGAAGT 11280 

CAAAGGGCAC TTAGTTCAAG GTACGCAGAA CATCTTCTTT GCTCAACTTG GTGATCCAGA 11340 

TGTGACGAAG TATTATTCAG GTGTGTCACG CTTTATGTCA GGCCGTTTTA TTACGATGAT 11400 

GTTCGGCTTA TGTGGTGCCX3 CACTTGCAAT TTATCACACA GCTAAACCTG AACATAAAAA 11460

AGTTGTCGGC GGTTTAATGT TATCCGCTGC ACTCACTTCA TTTTTAACAG GTATTACCGA 11520 

ACCTTTAGAG TTTAGTTTCT TGTTTGTCGC ACCTATTCTT TATGTAATCC ATGCCTTCTT 11580

TGATGGATTA GCATTTATGA TGGCAGACAT TTTCAACATT ACAATTGGTC AAACCTTCAG 11640 

TGGAGGCTTT ATCGATTTCT TACTCTTTGG TGTGCTACAA GGTAATAGTA AAACAAACTA 11700 

CCTATACGTC ATACCTATTG GAATTGTGTG GTTCTGTTTG TATTACATCG TTTTCAGATT 11760 

CTTAATTACG AAATTTAATT TCAAAACACC TGGTCGAGAA GATAAAGCTG CAGCACAACA 11820 

AGTTGAGGCT ACTGAAAGAG CACAAACTAT TGTTGCTGGT TTGGGAGGCA AAGATAACAT 11880 

TGAAATCGTT GACTGTTGTG CAACGAGACT ACGCGTCACA CTTCATCAAA ATGACAAAGT 11940 

CGATAAAGTA TTACTCGAAA GTACTGGTGC CAAAGGTGTA ATCCAGCAAG GCACTGGTGT 12000 

GCAAGTAATT TATGGGCCTC ACGTTACAGT TATCAAAAAT GAAATTGAAG AATTGCTCGG 12060 

GGATTAAGAC TAACCGAAAT ATCAACAGAA CTAATGGCAA CGATGTACGA AGTAAGAAGT 12120 

GACATCGTTG CTTTTATTTT TAATGTTACA TTTGAAGCAT TAAGTTCATC ATGCACTGTA 12180 

GTGAGCCCGC AAATCGCCTC TGCTAGACAA TCATCTTAAT GCTATGATTA AAGCTTAAGT 12240

GCCAGATTTG AATTTAATTT CAACAACGAC TTTCACTACA TTAAAAATAG GGCCACTCGA 12300

CACATATAGT TGTATCAAAT AGCCCTTTAT ACAATTTTTT GGGTAAGGTT TTACAATTTT 12360 

TGGGATGGTA TAGATTTTAT AAAAAGTTAT TTAAGTTCTT CTGCTTCAGC CATAATATCT 12420 

TTTAATGTTT TAGCTGAATG TGCGAACTTG CTTTGTTCTT CGTCGTTTAA TGGGATTTCT 12480 
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TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600 

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 12660

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTT3 12720 

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12780

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840

GCAACATCGn AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900 

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140

CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260 

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320 

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440 

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500 

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620

TTTAGTCTTC GTACTTCKSCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740 

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800 

GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980

ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100 

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160 

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220 

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 14280 
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8779 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

GGTATTTTnG GAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 60 

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATGGACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTACC CaTCaATACA 180

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt CGGTGACGTT 240

ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA ATGTTAATGC AAGCAATATT 300

AAGAATATTA TTTTACTTTT CACCAATTTT GTGGCTACCA AAGAACCATG GTATCAGTGG 360

TTTAATTCAT GAAATGATGA AATATAATCC AGTTTACTTT ATTGCTGAAT CATACCGTGC 420

AGCAATTTTA TATCACGAAT GGTATTTCAT GGATCATTGG AAATTAATGT TATACAATTT 480

CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540 

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600 

AAGTGGGGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 660 

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720 

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780 

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840 

CAACSCGCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900 

ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 1020 

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 1080

GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG ATTTTTACTT AGTGGGTTGT 1140 

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200 

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260 

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 1320 

GATAAAGCAG ATAATAGGGC TATTGATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 1380 
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ATCGACACGT CTACATTAAT GCTAATGTCA GATATAATTA TTAGCGACTA TAGTTCGCTG 1500 

CCAATAGAAG CTAGCTTGTT AGATATTCCA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560

TATGATCAGG TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 1680 

TTTAAAGATT GGCATAAGTA TAATACTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 1740

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 1800

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 1860

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 1920 

ATATTTCATT AATACAATTA CAAAATGCTT CGATAGCTAC GTGTATTAAT AAAGGTTTGA 1980 

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCATA AAACCAACAT 2040

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 2100

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGTG 2160 

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 2220 

CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAATTA CAGTGTGaCG 2280

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTACGG 2340

ATATACATTT AGTTTCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 2400

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 2460

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2520 

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2580

TTCJ\ACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 2640

TGTTTGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700 

AATTOTCGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 2760 

TGATACAACT TAAAGGGAGA AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2820 

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880 

AATGAAACGT GTAATAACAT ATGGCACATA TGACTTACTT CACTATGGTC ATATCGAATT 2940

GCTTCGTCGT GCAAGAGAGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3000 

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATCCTTGa 3060

ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180
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TAAAATCAAA CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3300

CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360

CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3840

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3900

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200

GTTATTCATT AAATCAACGA AATCGCTGGT GTTTTTTGAA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4320

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680

TAAATTTTTC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAGCA 4740

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
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AAGTATATGA TAGAAATGCA TGTATCTATC TAAATGAATT AACTATAAAT TTCAAACAGA 5100 

AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 5160 

CCAGTGGGTC TTATCGTTGC AGCTATCACT ATTTCATCAC TAGGGAGCTT AAGTGGACTA 5220

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT CCgTGAGCCA TATCAATTGG 5280

AATCtAATCG CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 5340

TTATATTTAT TAAGTAAAAT TGGTGAAAAG ATTATTTATG CGATACGCTC AGTTTTATGG 5400 

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 5460

AGTCGATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTCACAAAA GCTACCTtnAC 5520 

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 5580 

AAAATGACAT TATTAACATT TATAACGATA CCGATATTCG TTTTaATTAT GATTCCTCTA 5640 

GGTCGTATTA TGCAAAAGAT ATCGACAAGT ACACAATCTG AAATTGCAAA CTTCAGTGGT 5700

TTGTTAGGGC GTGTCCTAAC TGAAATGCGT CTTGTTAAAA TATCAAATAC AGAGCGTCTT 5760 

GAATTAGATA ATGCACATT^ AAATTTGAAT GAAATATATA AATTAGGTTT AAAACAGGCT 5820

AAAATTGCGG CAGTTGTACA ACCAATTTCA GGTATAGTTA TGTTGCTAAC AATTGCAATT 5880

ATTTTAGGTT TTGGTGCATT AGAAATTGCG ACTGGTGCAA TCACTGCAGG TACATTAATT 5940

GCAATGATAT TTTATGTTAT TCAGTTATCT ATGCCTTTAA TCAATCTTTC CACGTTAGTT 6000

ACAGATTATA AAAAGGCAGT CGGTGCAAGT AGTAGAATAT ACGAAATCAT GCAAGAACCT 6060

ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 6120 

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 6180 

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTGGGTCTGG TAAAAGTACG 6240 

ATATTTAATC TGATAGAACG TATGTATGAA ATTGAGTCAG GTGATATTAA ATATGGCCTT 6300

GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 6360

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 6420 

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 6480

CAATTTGATG AAGGATATGA CACGCTTGTA GGTGAACGAG GATTGAAACT GTCTGGCGGA 6540

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6600

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660

ACATTGATGG AAGGTAGAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6720

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCATTCAGAA 6780 
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TTTTATATAT ATAAGTAAGC TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 6 900 

TGGCACATTG ATGGATATAG ATGTTAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 6960

ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 7020

ATGCTCATGG TATAATACAA GTTATAAGCA AACATACATA TATTAAATAC TGTAGCCACG 7080 

AGTCATAATT CTTCATATTT TACATAGCAA TTTAACTGAT TTTAGAGTCC ACGGTACAGA 7140

AGTTTGATAT TTCAATGTTT CTAAATTTTT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 7200

GTTTTTATTA ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTGCGG TATTATTTTC 7260 

AAGAGATCGC AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 7320 

TTTAGCATGG TTCTTTATTT ATTTTGATTG GGGTCTIAAAA GCAGTAAGAG GAGCAGCCAA 7380 

TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTGGTACA GGTTTTGCAT TTGCAAGTTT 7440

GACAAATGTT AAAATGATGG ATATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 7500 

GCCATTATTT GATATCTTAA TGTACTTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 7560 

TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCGAG TCATTCTTTG GGATAGAAAT 7620 

GATGTTCTTA GGAAATACTG AAGCATTAGC CGTATCAAGT GAGCAACTAA AACGTATGAA 7680

TGAAATGCGT GTATTAACAA TCGCAATGAT GTCAATGAGC TCTGTATCGG GAGCTATTGT 7740

AGGTGCGTAT GTACAAATGG TACCAGGAGA ACTGGTACTA ACGGCAATTC CACTAAATAT 7800

CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 7860

TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CCATTCTTCT CATTCCTTGG 7920 

AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 7980

TGTAGCGTTA GCTGATCTAT TTGATCGTTT TATCAATTTG ATTACAGGAT TGATAGCAGG 8040

ATGGXTAGGC ATAAAAGGTA GTTTCGGTTT AAACCAAATT TTAGGTGTGT TTATGTATCC 8100 

ATTTGCGCTA TTACTCGGTT TACCTTATGA TGAAGCGTGG TTGGTAGCAC AACAAATGGC 8160 

TAAGAAAATT GTTACAAATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG ATATTGCATC 8220 

TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTG CAAACTTCTC 8280 

AACGATTGGT ATGATTATCG GTACATTGAA AGGCATTGTT GATAAAAAGA CATCAGACTT 8340 

TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 8400 

AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACATTTT 8460

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 8520

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580 
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GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8700

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 8760

ATGAAAGTAA ATTAAAAAT 8779

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31096 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT 60 

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120 

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC 180 

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT 240 

CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA 300

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA 360 

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC 420 

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA 480

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATCC 540 

AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG 600

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC 660 

AGA;^TTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT ATTTCTACAA TTTTAATTTC 720 

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATCGTT AAATCAATGA AAAAAGGTTC 780 

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840 

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960 

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020 

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA GCTTCATCAC ATGACCTAGA 1080 

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140 

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200 
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TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 1320 

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATnAAAA TTAATCACTT AACAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500 

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAATG CCTGAAACAG CACCACAAGC 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680 

ATATGACX3AT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740 

TATTTGGAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA TTGCAGGTAT 1800 

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860 

TGTTCATGGT ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920 

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATG AAGTAGTTAA 1980 

ACATTTAGTA GATGAATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040 

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTX3CAAT 2100

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC TGAATATTGC 2220 

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340 

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400 

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CGAAGTATAT 2520 

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCX3CAAT CATTTATTTA 2580 

TTTTCCAGCT AACXSTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA TTAATTTATT 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700 

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760 

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTCA GTTGGTATCG GTTGTATTAT 3000 
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TGGTAATTTA AATGCAGCTT CAGATACATC AAAAATATTA TTTGGTGAAA ATGGCGGTAA 3120 

GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATACTATGAC 3180

TGGTATGCGC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420 

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540 

TCTAATAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600 

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660 

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720 

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTGTTCTA AGTGTAATAA CGTTTTGGCT 3840

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020

PJ^TPlTATTPl GGTTCATTAT TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CTiAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140 

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGGTCAA TTGGCTCATG 4200

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCG 4260 

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440 

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500 

AGTTCTTGAA AAGCGTGCTA CAAATCCTTT AATCGATTTT AAATTATTTA AAAATAAAGC 4560

TTACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCGGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCG GAGAATGTCT 4800 

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ATTCTTTGGT TTAGGACTAG GGATATATGC TACACCATCA ACAGATACAG CAATTGCAAA 4 92 0 

TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACXSACAC TCAATTATGA TAATTGAGAA 5160

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220 

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340 

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG 5400 

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT 5460 

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA 5520 

CAACAAGATA AGGAGTGAAC AATAGCTGTG AATTATCGTG ATAAAATTCA AAAGTTTAGT 5580 

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA 5640 

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG 5700

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA 5760 

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT 5820

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA 5880

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT 5940 

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060 

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120 

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180 

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC 6240 

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300 

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600
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GTAGGAAGAA CTGACTTTGT AACAGTTAAT TCAGATGGAA CAAATGTACA ATGGAGTCAT 6 720 

GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780 

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900 

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020 

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080 

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATTTATGT ACAGTCAACT ATACGTCGAA 7140 

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200 

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260 

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320 

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380

CCGGGTGQAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560 

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGCTGAA AAACCATTTT 7620

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860

TACC5GGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920 

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100 

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160 

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220 

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280 

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340

CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400 
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TACAGGTAGA GTGAGTATGA ATCAGGCATT TAACAGTGAT ATTACATTTA AAGTGTCAGC 8520 

GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700 

TAAAAACCAA AATATAAGAG GATATTTASC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760 

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820

GATGACATAC GAACCAGTTG TGAAACCTGA ATACCAAACT GTCAATGCTG CTAAAACAGC 8880

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940 

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000 

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060 

TGCTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120 

TGTGAAACAA CCAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180 

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240 

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAACGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TGATGGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720 

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCGTG CAAACCATAG 9780 

TAACTCAACT GTTGTTAACX3 TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900 

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGG 9960 

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020 

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140

AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200 
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GTTCAATGTA ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320 

TGCTAATCCT GTGAGAATTG CCAATATTTC GAATAATGCG ACAGTATCAC AAGCTGATCA 10380 

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CCAAATAGAA GTTATGCAAG 10440

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500 

TGCCAATGTg cACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560 

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680 

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740 

GTATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCGA 10800 

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTGAAACAG 10860 

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TATGTTAAGC 10920 

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980 

ATGGTCCTAA CAAAGATGTT GTAGGCATAT CTACTCGTCT TATTAGAGTG ACATATGATA 11040 

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACCTGACCCA CCTAGAATTG 11100

ACGCAAACTC TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160 

TATTAAATAA CTCGTCAGTA AAATTATTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220

ATATTACTCA TGGTAGCGGT TTTAGTTCGG TTGTGACAGT AAGTGACGCG TTACCAAATG 11280

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340 

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460 

GCGACGGTTT TGATTTCGGA CACGTAGAAA GATTTATTCA AAACCCGCCA CATGGGGCAA 11520 

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580 

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACGCGTAA TGTTGAAGTT CCAGTCAAAG 11640 

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700 

ATGGAACGGA TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820 

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACXJ AGTTCCTGTT ACTGTTAATG 11880 

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGCAA 11940

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAAC7VGATG 12000
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T3AATAAACC GAATGTGGCT AAAGTCGTTA ACGCAAAATA TGACGTCATC TATAACGGAC 12120 

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180 

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240 

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300

GTAACGTTGT GACX3ACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480 

GTGATGAGCA ACGTAGTGAT GATTTCACAG TTGTCGCACC ACAACCGAAC CAAGCGACTA 12540

CTAAGATTTG GCAAAATGGT CATATTGATA TCACGCCTAA TAATCCATCA GGACATTTAA 12600

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660 

ATAGTAAGAC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720 

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840

ATCCAAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140 

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200 

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260 

CAA^CCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320 

ATCAAGCTAA AGCATTACTT CAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380 

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440 

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAACGTG 13500 

TTATTGACAA TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT CGTGTCGATA 13560 

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620 

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740 

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800 
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CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13920

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 13980

GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160

TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 14220

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 14340

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 14400

AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580

ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700

TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 14880

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940

TTAATGCTGC TCAAAATCAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000

ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 15060

AAAAJTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120

CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180

AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240

CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300

ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360

AAACGACTCA AAGCTTAAAT ACTGGTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 15420

ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480

ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540

CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600
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ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140

CAJcCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGCGA 16260

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860

TGATACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400
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TGTTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17520 

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580 

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640 

TCAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700 

AGCAAGCAGT AAATATGTTA ATGCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760 

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820

AACAGCTGCA GCTAATCAAG TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880 

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACGTACAAAC 18000

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120 

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATGCGAC 18180 

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240 

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300 

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480 

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660 

AAATSCATTA AATAACTT/IA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720 

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780 

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900 

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960 

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080 

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140 

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200 
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TAACAATGCT GTTGATAGTG CTAATGGTGT CATTAATGCA ACAAGCAATC CAAATATGGA 19320 
TGCTAATGCA ATTAACCAAA TCGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380 
TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500
AAATGTAACA AGTATCCAAC AAACTGCAAA TCAACTTAAT ACAGCTATGG GTCAATTACA 19560
ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGcTGAACA 19620
AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAATTT TAAATAAACA 19680 
AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACCGTGCA TTACAACAAG TAACAAGTAC 19740 

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800 

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920 

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CAGCGGCTGA 20040 

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160 

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGACAA 20220

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCCAAA ATGATAATAC 20340 

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400 

TGCTGTAAAT AATGCAAATG GTGTTATTAA TGCAACGAAC AATCCAAATA TGGATGCTAA 20460 

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACTW^ GCAGCGTTAA ATGGTGCACA 20520 

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580 

CCAAAAACAA AAAGATGCAT TAAAAACAC7V AGTTAACAAT GCACAACGTG TATcTGATGC 20640 

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGCTG ATCAAGAAAA 20760 

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTACACCGAA 20820 

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000 
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GAAAGGCTTA AGAGATAGTA TTGCGAATGA AGCAACAATT AAAGCAGGTC AAAACTACAC 21120 
TGACGCAAGT CCAAATAATC GTAACGAGTA CGACAGTGCA GTTACTGCAG CAAAAGCAAT 21180 

CATTAATCAA ACATCGAACC CAACGATGGA ACCAAATACT ATTACGCAAG TAACATCACA 21240

AGTGACAACT AAAGAACAGG CATTAAATGG TGCGCGAAAC TTAGCTCAAG CTAAGACAAC 21300
TGCGAAAAAC AACTTGAATA ACTTAACATC AATTAACAAT GCACAAAAAG ATGCGTTAAC 21360 

GCGTAgcATT GATGGTGCAA CAACAGTAGC TGGTGTAAAT CAAGAAACTG CAAAAGCAAC 21420

AGAATTAAAT AACGCAATGC ATAGTTTACA AAATGGTATC AATGATGAGA CACAAACAAA 21480

ACAAACTCAG AAATACCTAG ATGCAGAGCC AAGTAAGAAA TCAGCTTATG ATCAAGCAGT 21540 

AAATGCAGCG AAAGCAATTT TAACAAAAGC TAGTGGTCAA AATGTAGACA AAGCAGCAGT 21600

TGAACAAGCA TTGCAAAATG TGAACAGTAC GAAGACGGCG TTGAACGGTG ATGCGAAATT 21660 

AAATGAAGCT AAAGCAGCTG CGAAACAAAC GTTAGGTACA TTAACACACA TTAATAATGC 21720 

ACAACGTACA GCGTTAGACA ATGAAATTAC ACAAGCAACA AATGTTGAAG GTGTTAATAC 21780 

AGTTAAAGCC AAAGCGCAAC AATTAGATGG TGCTATGGGT CAATTAGAAA CATCAATTCG 21840 

TGATAAAGAC ACGACGTTAC AAAGTCAAAA TTATCAAGAT GCTGATGATG CTAAACGAAC 21900

TGCTTATTCT CAAGCAGTAA ATGCAGCAGC AACTATTTTA AATAAAACAg CTGGCGGTAA 21960 

TACACCTAAA GCAGATGTTG AAAGAGCAAT GCAAGCTGTT ACACAAGCAA ATACTGcATT 22020 

AAACGGTATT CAmAACTTAG ATCGTGCGAA ACArGCTGCT AACACAGCGA TTACAAATGC 22080

TTCGGACTTA AATACAAAAC mAAAAGAAGC ATTAAAAgCA CAAGTAACAA GTGCAGGACG 22140 

TGTATCTGCA GCAAATGGTG TTGAACATAC TGCGACTGAA TTAAATACTG CGATGACAGC 22200

TTTAAAGCGT GCCATTGCTG ATAAAGCTGA GACAAAAGCT AGTGGTAACT ATGTCAATGC 22260 

TGATffCGAAT AAACGTCAAG CATATGATGA AAAAGTTACA GCTGCCGAAA ATATCGTTAG 22320 

TGGTACACCA ACACCAACGT TAACACCAGC AGATGTTACA AATGCAGCAA CGCAAGTAAC 22380 

GAATGCTAAG ACGCAGTTAA ACGGTAATCA TAATTTAGAA GTAGCGAAAC AAAATGCTAA 22440 

CACTGCAATT GATGGTTTAA CTTCTTTAAA TGGTCCGCAA AAAGCAAAAC TTAAAGAACA 22500 

AGTGGGTCAA GCGACGACGT TGCCAAATGT TCAAACTGTT CGTGATAATG CACAAAC7VTT 22560 

AAACACTGCA ATGAAAGGTC TACGAGATAG CATTGCGAAT GAAGCAACGA TTAAAGCAGG 22620 

TCAAAACTAC ACAGATGCAA GTCAAAACAA ACAAACTGAC TACAACAGTG CAGTCACTGC 22680 

AGCAAAAGCA ATCATTGGTC AAACAACTAG TCCATCAATG AATGCGCAAG AAATTAATCA 22740

AGCGAAAGAC CAAGTGACAG CTAAACAACA AGCGTTAAAC GGTCAAGAAA ACTTAAGAAC 22800 
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AGATGCAGTG AAACGTCAAA TCGAAGGTGC AACGCATGTT AATGAAGTAA CACAAGCACA 22 920 

AAATAATGCG GATGCaTTAA ATACAGCTAT GACGAACTTG AAAAATGGTA TTCAAGATCA 22980

GAATACGATT AAGCAAGGTG TTAACTTCAC TGATGCCGAC GAAGCGAAAC GTAATGCATA 23040

TACAAATGCA GTGACGCAAG CTGAACAAAT TTTAAATAAA GCACAAGGTC CAAATACTTC 23100 

AAAAGACGGT GTCGAAACTG CGTTAGAaAA TGTACTiACGT GCTAAAAACG AATTGAACGG 23160 

TAATCAAAAT GTTGCGAACG CTAAGACAAC TGCGAAAAAT GCATTGAATA ACCTAACATC 23220

AATTAATAAT GCACAAAAAG AAGCATTGAA ATCACAAATT GAAGGTGCGA CAACAGTTGC 23280 

AGGTGTAAAT CAAGTGTCTA CAACGGCATC TGAATTAAAT ACAGCAATGA GCAACTTACA 23340 

AAATGGTATT AATGATGAAG CAGCTACAAA AGCAGCGCTT AATGGTACTC AAAACCTTGA 23400

AAAAGCTAAA CAACACGCAA ATACAGCAAT TGACGGTTTA AGCCATTTAA CAAATGCACA 23460

AAAAGAGGCA TTAAAACAAT TGGTACAACA ATCGACTACT GTTGCAGAAG CACAAGGTAA 23520 

TGAGCAAAAA GCAAACAATG TTGATGCAGC AATGGACAAA TTACGTCAAA GTATTGCAGA 23580 

TAATGCGACA ACAAAACAAA ACCAAAATTA TACTGATGCA AGTCAGAATA AAAAGGATGC 23640

GTACAATAAT GCTGTCACAA CTGCACAAGG TATTATTGAT CAAACTACAA GTCCAACTTT 23700

AGATCCGACT GTTATCAATC AAGCTGCTGG ACAAGTAAGC ACAACTAAAA ATGCATTAAA 23760

TGGTAATGAA AACCTAGAGG CAGCGAAACA ACAAGCGTCA CAATCATTAG GTTCATTAGA 23820

TAACTTAAAT AATGCGCAAA AACAAACAGT TACTGATCAA ATTAATGGCG CGCATACTGT 23880

TGATGAAGCA AATCAAATTA AGCAAAATGC GCAAAACTTA AATACAGCGA TGGGTAACTT 23940

GAAACAAGCG ATAGcTGACA AAGATGCTAC GAAAGCGACA GTTAACTTCA CTGATGCAGA 24000

TCAAGCAAAA CAACAAGCAT ATAACaCTGC TGTTACAAAT GCTGAAAATA TCATTTCAAA 24060

AGCTAATGGC GGCAATGCAA CACAAGCTGA AGTTGAACAA GCAATCAAAC AAGTTAATGC 24120 

TGCAAAACAA GCATTAAATG GTAATGCCAA CGTTCAACAT GCAAAAGACG AAGCAACAGC 24180 

ATTAATTAAT AGCTCTAATG ACCTTAACCA AGCACAAAAA GACGCATTAA AACAACAAGT 24240 

TCAAAATGCA ACTACTGTAG CTGGTGTAAA CAATGTTAAA CAAACAGCAC AAGAGTTAAA 24300

CAATGCTATG ACACAATTAA AACAAGGCAT TGCAGATAAA GAACAAACAA AAGCTGATGG 24360

TAACTTTGTC AATGCAGATC CTGATAAGCA AAATGCATAT AATCAAGCAG TAGCGAAAGC 24420

TGAAGCATTA ATTAGTGctA CGCCTGATGT TGTCGTTACA CCTAGCGAAA TTACTGCAGC 24480

GTTAAATAAA GTTACGCAAG CTAAAAATGA TTTAAATGGT AATACAAACT TAGCAACGGC 24540

GAAACAAAAT GTTCAACATG CTATTGATCA ATTGCCAAAC TTAAACCAAG CGCAACGTGA 24600 
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AGCGGCGACA ACGCTTAATG ACGCGATGAC ACAATTGAAA CAAGGTATTG CGAATAAAGC 24720

ACAAATTAAA GGTAGCGAGA ACTATCACGA TGCTGATACT GACAAGCAAA CAGCATATGA 24780

TAATGCAGTA ACAAAAGCAG AAGAATTGTT AAAACAAACA ACAAATCCAA CAATGGATCC 24840

AAATACAATT CAACAAGCAT TAACTAAAGT GAATGACACA AATCAAGCAC TTAACGGTAA 24900

TCAAAAATTA GCTGATGCCA AACAAGATGC TAAGACAACA CTTGGTACAC TAGATCATTT 24960

AAATGATGCT CAAAAACAAG CGCTAACAAC TCAAGTTGAA CAAGCACCAG ATATTGCAAC 25020

AGTTAATAAT GTTAAGCAAA ATGCTCAAAA TCTGAATAAT GCTATGACTA ACTTAAACAA 25080

TGCATTACAA GATAAAACTG AGACATTAAA TAGCATTAAC TTTACTGATG CAGATCAAGC 25140

TAAGAAAGAT GCTTATACTA ATGCGGTTTC ACATGCAGAA GGTATTTTAT CTAAAGCAAA 25200

TGGCAGCAAT GCAAGTCAAA CTGAAGTGGA ACAAGCGATG CAACGTGTGA ACGAAGCGAA 25260

ACAAGCATTG AATGGTAATG ACAATGTACA ACGTGCAAAA GATGCAGCGA AACAAGTGAT 25320

TACAAATGCA AATGATTTAA ATCAAGCAAT GACACAATTG AAACAAGGTA TTGCAGATAA 25380

AGACCAAACT AAAGCAAATG GTAACTTTGT CAATGCTGAT ACTGATAAGC AAAATGCTTA 25440

CAACAATGCG GTAGCACATG CTGAACAAAT AATTAGTGGT ACACCAAATG CAAACGTGGA 25500

TCCACAACAA GTGGCTCAAG CGTTACAACA AGTGAATCclA GCTAAGGGTG ATTTAAACGG 25560

TAACCATAAC TTACAAGTTG CTAAAGACAA TGCAAATACA GCCATTGATC AGTTACCAAA 25620

CTTAAATCAA CCACAAAAAA CAGCATTAAA AGACCAAGTG TCGCATGCAG AACTTGTTAC 25680

AGGTGTTAAT GCTATTAAGC AAAATGCTGA TGCGTTAAAT AATGcAATGG GTACATTGAA 25740

ACAACAAATT CAAGCGAACA GTCAAGTACC ACAGTCAGTT GACTTTACAC AAGCGGATCA 25800

AGACAAACAA CAAGCATATA ACAATGCGGC TAACCAAGCG CAACAAATCG CAAATGGCAT 25860

ACCAACACCT GTATTGACGC CTGATACAGT AACACAAGCA GTGACAACTA TGAATCAAGC 25920

GAAAGATGCA TTAAACGGTG ATGAAAAATT AGCACAAGCG AAACAAGAAG CTTTAGCAAA 25980

TCTTGATACG TTACGCGATT TAAATCAACC ACAACGTGAT GCATTACGTA ACCAAATCAA 26040

TCAAGCACAA GCGTTAGCTA CAGTTGAACA AACTAAACAA AATGCACAAA ATGTGAATAC 26100

aGCaATGAGT AACTTGAAAC aAGGTATTGC cLAACAAAGAT ACTGTCAAAG CAAGTGAGAA 26160

CTATCATGAT GCTGATGCCG ATAAGCAAAC AGCATATACA AATGCAGTGT CTCAAGCGGA 26220

AGGTATTATC AATCAAACGA CAAATCCAAC GCTTAACCCA GATGAAATAA CACGTGCATT 26280

AACTCAAGTG ACTGATGCTA AAAATGGCTT AAACGGTGAA GCTAAATTGG CAACTGAAAA 26340




GCAAAATGCT AAAGATGCCG 


TAAGTGGGAT 


GACGCATTTA AACGATGCTC 


AAAAACAAGC 


26400 
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AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26520 

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 26640

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26700

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760

TX3ATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820 

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26880

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAAG 26940 

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000 

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAACGACA CAAGTGAATA ATACGAAAGT 27060 

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 27120

TCAATTAGAT CATTTGAATA ATGCGCAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180 

ATCTGATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300 

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360 

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 27420 

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTACAAGTTG CTAAAACAAA 27480 

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCT^AAAAA CAGCATTAAA 27540 

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600

TACX3CTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAACG CAGCAACTAA 27660 

AGCflAATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720 

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27840

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27900

AAAACATATG GAAGATACGT TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 27960 

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACSVACAAA GTATTGCTGA 28020

CAAAGATGCA ACACGTGOGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 28080

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 28140

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAATA TCATCTAAAA ATGCATTAGA 28200 
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TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 28320

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500 

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 28560

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 28620

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680 

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 28740 

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACGG AAGCGGGTAG 28800

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 28860 

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 28920

GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACGA 28980

TAAACAACAT GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATAATC CGCAACGTCA 29040

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 29160

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 29220

AGCAGCAGTT CAAAATGCAA AAGATTTAAT TAACCAAACA GGTAATCCAA CACTCGACAA 29280

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 29340

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 29400 

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 29460 

GGTTSCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 29520 

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CXSTCAACTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700 

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760 

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820 

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 29880 

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940 

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000 
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TGCAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 30120

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 30180

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACTTAA 30240

TCGTGCAATG GATCAATTGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 30300

CACGAACTAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 30360

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 30540

ACAATTAGCA ATCCAACAAA TTAATAATGC TGAAACGCTA AATAAAGCAT CTCGAGCAAT 30600 

TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660 

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780 

TGCaAAAGCT GAAGC7VGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840 

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900 

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30960 

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2243 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60 

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAAAAAG CATTAAAATA TGTGCGTAAA 120 

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180 

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACGAA ATGATTCAAC AGGAGATTTC 300
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TAATCAGAGA AGGAATGAAC AGAAATGACA AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420 

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600

GTACCAGGTC CTGTTGACTT TGAAAAAGGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780 

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900 

CgATCAACGA TTTaAATGTA ATTGTGGTCG TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960 

GACAGGCGTT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020 

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080

TGGTGACCAA TTCTGTATTT TCATTACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440

GATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500 

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560 

CATGSATTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620 

CGACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CAGTACAAAA ACATTTAGAA 1680

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740

CGTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920 

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100
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GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 2220

ATAAAATATT TGAATTAGAA GGC 2243

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

TTGGnATCAT CyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTG 60 

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA GTCTACAATC GCTTCATTTA 120

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT ATGGACAATT AACGGAATAA 180

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 240

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 300 

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 360 

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420 

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 480

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 540

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 600 

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 660 

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720 

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 780 

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 840 

GTCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900 

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 960 

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 1020 

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 1080 

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 1140

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200 

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCAGTT ATGATTGGGA 1260 
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GTTATATAAC AAAGGTTTAG CATACGTTGA TGAAGTTGCA GTTAACTGGT GTCCAGCATT 1380

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 1440

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 1500

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 1560

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAATACGG AAGGAAAAGT 1620

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 1680

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG TAAAAGCTTA 1740

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACGTACA GATTTAGCAA AAGATAAATC 1800

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 1920

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 1980

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAACATATTA ATTCTGGTGA 2040

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCX; 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2280

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340

GAAAGGACGT CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 2400

ACGTTACATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 2460

GTTACCTGTT GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTACTAAAG AACCTTTCCA 2580

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 2640

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2700

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GCGTTTAATG GTAAATGAAG ATGGGACATT 2820

GAGTTCAAAA ATTGTAACTA CAAATAATAA ATCTTTAGAT AAAGTTTATA ACCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2940

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATOGGTGAA GAATTATGGT CAAAATTAGG 3060
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TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180 

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATGACA ATGTTAAAGC 3240

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3300

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3360

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 3420

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480

TTTAAATTCA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3540

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600 

CATGCACGCA TGGGGCGATG AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TCGGTTGCTT GGTGTTTTTT 3720 

GGTATGAATT ACTTTCTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 3780

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3840

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3900

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT GAATTTGTTA AAGAAAAATA AATATAGTAT 4020

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGA ACAGTTTTAT TACTTTCAAA 4080

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4200

GCAGAATACA CCTAATCAAT CTGCAACAAC CAATCAAGCA ACGAATCAAG CATTGGTTAA 4260

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4320

TTCASCACAA AACAATAATC ATAC7U3ATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4380

CGCTAATAAT AATGATGTAG TGTCGAATAA TACCGCATTA AATGTACCAA CTAAAACAAA 4440 

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4500 

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4560

AAGTAGACGT GCGGCACCGG CAGATCCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620

GGTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA CAACTGATCC 4680

TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 4740

TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 4800

CTTCACACTA ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG TAAGAACGAG 4860
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TCGTATACAT GGAACTGATA CGAATGACCA TGGCGATTTT AATGGTATCG AGAAAGCATT 4980

AACAGTAAAT CCGAATTCTG AATTAATCTT TGAATTTAAT ACAATGACTA CTAAAAACGG 5040 

TCAAGGCGCA ACAAATGTTA TTATCAAAAA TGCTGATACT AATGATACGA TTGCTGAAAA 5100 

GACTGTTGAA GGCGGTCCAA CTTTGCGTTT ATTTAAAGTA CCTGATAATG TGAGAAATCT 5160 

CAAAATTCAA TTTGTACCTA AAAATGACGC AATAACAGAT GCGCGTGGCA TTTATCAACT 5220

AAAAGATGGT TACAAATACT ATAGCTTTGT TGACTCTATC GGACTTCATT CTGGGTCACA 5280

TGTTTTTGTT GAAAGACGAA CAATGGATCC AACAGCAACA AATAATAAAG AGTTTACTGT 5340

AACAACATCA TTAAAGAATA ATGGTAATTC TGGTGCTTCT CTAGATACAA ATGACTTTGT 5400

ATATCAAGTT CAATTACCTG AAGGTGTTGA ATATGTGAAC AATTCATTGA CTAAAGATTT 5460 

TCCAAGTAAC AATTCAGGCG TTGATGTTAA TGATATGAAT GTTACATATG ATGCAGCAAA 5520 

TCGTGTGATA ACAATTAAAA GTACTGGAGG AGGTACAGCA AACTCTCCGG CACGACTTAT 5580

GCCTGATAAA ATACTCGATT TAAGATATAA ATTACGTGTA AATAATGTGC CGACACCAAG 5640 

AACAGTAACA TTTAACGAGA CATTAACGTA TAAAACATAT ACACAAGATT TCATTAATTC 5700 

AGCTGCAGAA AGTCATACTG TAAGTACAAA TCCATATACT ATCGATATCA TCATGAATAA 5760 

AGATGCATTA CAAGCCGAAG TTGACAGACG TATTCAACAA GCTGATTATA CATTTGCGTC 5820 

ATTAGATATC TTTAATGGTC TGAAACGACG CGCACAAACG ATTTTAGATG AAAATCGTAA 5880

CAATGTACCA TTAAATAAAA GAGTTTCTCA AGCATATATT GATTCATTAA CTAATCAAAT 5940

GCAACATACG TTAATTCGAA GTGTTGATGC TGAAAATGCA GTTAATAAAA AAGTTGACCA 6000

AATGGAAGAT TTAGTTAATC AAAATGATGA ATTGACAGAT GAAGAAAAAC AAGCAGCAAT 6060 

ACAAGTTATC GAGGAACATA AAAATGAAAT AATTGGTAAT ATTGGTGACC AAACGACTGA 6120 

TGATSGCGTT ACTAGAATCA AAGATCAAGG TATACAGACC TTAAGTGGGG ATACTGCAAC 6180 

ACCGGTTGTT AAACCAAATG CTAAAAAAGC AATACGTGAT AAAGCAACGA AACAAAGGGA 6240

AATTATCAAT GCAACACCAG ATGCTACTGA AGACGAGATT CAAGATGCAC TAAATCAATT 6300

AGCTACGGAT GAAACAGATG CTATTGATAA TGTTACGAAT GCTACTACAA ATGCTGACGT 6360 

TGAAACAGCT AAAAATAATG GCATCAATAC TATTGGAGCA GTTGTTCCTC AAGTAACTCA 6420 

TAAAAAAGCT GCAAGAGATG CAATTAACCA AGCAACAGCA ACGAAAAGAC AACAAATAAA 6480

TAGTAATAGA GAAGCAACTC AGGAAGAGAA AAATGCAGCA TTGAACGAAT TAACTCAAGC 6540

AACCAACCAT GCTTTAGAAC AAATCAATCA AGCAACAACA AATGCTAATG TTGATAACGC 6600

CAAAGGAGAT GGTCTAAATG CCATTAATCC AATTGCTCCT GTAACTGTTG TTAAGCAAGC 6660 
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TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC 6780 

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6840

TGCGATTCAA GGAATACAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGCAAAAAA 6900

TGCCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6960

GCTCGAAGAA CAACAAGCAG CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020

AAATATTAAT GCAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7080

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 7140

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACTGGCGCGA CTCGAGAAGA 7200 

GAAACAAGAA GCGATAAATC GTGTCAATAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260 

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320 

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGCTACA GGTGTATTAA ATGATTTAGC 7380

AACTGCTAAA AAGCAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440 

GGCTTTAAAT CAAGTGGATC AAGAGTTAGC AACGGCAATT AATmATATAA ATCAAGCTGA 7500 

TACAAATGCX3 GAAGTAGATC AAGCXSCAACA ATTA6GTACA AAAGCAATTA ATGCGATTCA 7560

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 7620

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 7680

TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAACAAGCTA ACACAAATGC 7740

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7860

GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 7920 

TCAACTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7980 

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009 
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10953 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 60 








AGATGAATGC TAACCATATT CATTCTGCTA AAGATGGTCG TGTTACTGCG ACAGCTGAAA 130

TTATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AATTAAGAAT GACAAAGAAC 240

AATTAATTAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300

AGCTGAAATG TTATGAGATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360

TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTACGGCCAC CCATAACATT 420

TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480

CCATTCTAAT TTTTCGGAAA TAACAGGGTA ATTACATTCG TTGATAGGTG CATCATAATT 540

TTGTATTAAT TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600

CATAATCTGT TGATATGGAA CAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660

AGTGATAGGT GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720

GTACAGTTCG TGAGGCATTG ATCCTTTATG ACGTTCGCX3T TGTACAATGG CATTTCTTTC 780

AGGCATGCTT TTAGTACTTA AAAATGAAGA CATATTTTTC GGACCTAACC AACCAGGATC 840

AGCATCAAAG TCATGTATTT CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900

ATCATGATTT AACAATTTAA GTGCAAGATG TGCAGCAGTa ATGCCGCTAC CAACGATATG 960

ATCGGTCTTA TCATATACTA CTTGATCAAG TTCTTTCTCG AAGATATGAT TTACATTCTG 1020

TTTGTCTTTT AAAATGTCAG GCATAAACGG AATATTTGTA CTGCCTATTG CAATAACGAC 1080

GCAATCTGTA GTGATAATTT GTCCATCTTC TAACTTGATA TGCCATTTGT CTTCTTGTTT 1140

ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200

ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260

ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320

ATGTACAATC GGTGATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380

TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGTTAATCGG TCTGTTGTTA ATCCGCTTGA 1440

TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATGCAT 1500

AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560

TACATAACGA CATX3AAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620

AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680

CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATTTATATTC ATGATTTTGT 1740

TATATAATCT ACAATTTTGT GTCTTTTAAG TCTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800

TTAGTATAAG GCGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAGAGGA 1860
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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 1980

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 2040

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100

ATCATGTTCA ATCACCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 2220

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGGA TCTAATGCGA CAACACGATC 2280

GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 2340

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460

GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580

ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTG 3060

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360

AATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660
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ATATTGACTA AGCCACTGTC ATGCGCTGAA AGGTTAGCTA ATTTATTGCT TATATCTGCT 3780 

AGATTCAATG TTCCTACTAC TGAATATAAA ATCGCTACAC CCATTACGAA GAAGGATGAC 3840

GATACAACGT TAACAAGAAC ATATTTTATT GTTTCTTGTA GTTGAATTTT TGTAGAACCA 3900

ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 3960

ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 4020

TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 4080

GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TGTCTAATAC AAAGACAATA 4140

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 4200

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4260

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4320

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 4380

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 4440

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTTGCG GGaTAGGaTC AACATAGCTT 4500

TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4560

AATAAATTTG CTGCATGTGT TAATAGTGTA GTTCCCATAA CAATTCGTAT CAGACTTTTA 4620

GACAAAACGA GATAGACACT AATTGCTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680

TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4740

ATAAAACACC GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTCT AATATAAATA 4800

ACGGTATATC AAATGTGACA TGCGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860

GCGTCGCAAT ACAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4920

TTTI^CGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4980

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 5040 

CAAAGACCAT TACCATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100 

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160 

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 5220 

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC tAAATCATAA 5280 

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5340

CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 5400

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 5460








ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5580

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 5640

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 5700

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 5760

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGCAA TAACATTTGC ACTTCTGTTG 5820

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 5940

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6000

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 6120

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180

ACAAATGTGA AGACACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 6240

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 6300

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATAA CAGTGATTGT 6360

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 6660

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 6840

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6960

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7020

TTTTTGAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 7080

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 7140

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTG AAATAAGCAA 7260
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CCAATTTAAG GTTTTCATTA CAGTATTACC TGACATCGTC GTTTTAATTA ATGTAAGCAT 7380

ATAAATAAAT ATGACGATAG GGACAGGTAA TACGAACCAT CCTAAATGTA TACGTTTAAA 7440

AAATCTATAC AGGATAGGAA TAATGAGTGC GAATATTAAC GGTAATATCA CCGCAATATG 7500

TAACAAACTC ACTATGTTGT CCTCCTTTAA AAAATATTTA TGTTATTCAT TATACATGAA 7560

TGATATAGTT CTGAAAAACG TACACACTCC TTGTTGTGCT TTATTTTCAG AaGTATTTAA 7620

ATAAGAAGAA ACACGTCATT TTTTATTTAA AATTTTCTTT GTATTGAAGT GAATAATCTT 7680

CTTTTAAGCG TGCTAAACTA GCTAAAGACA TTTCAGCATG TTTTGTTTGC TGAGCTTTAA 7740

GTTTAGTTTC TAAATCTGTA ATTGCTTGTT GAAGTGAATC TTCATAGCGC AATACATCAA 7800

CATTGAAGTC GCGTAATTGT GAACGTTTCG TATAGCGTTT TTCAAAATGG CTTAATGCTT 7860

TGCGGTCATG GAAAAATACA CCTTCAGTTT CAGTAGGGTT ATGTAAATCA CCTTGTTTCG 7920

GGTGTTTGAT AACTTGTTCA ACTTTAACAA GGACATCGTC TCCATTTTCT TCAACAATCG 7980

TGACACCATA GCTACCTGTT TTGTGTGAAA ATCGATATAG CTTCATGCTA TTTTCCTCCC 8040

TTAAAAGTAT GTTAATATAT ATGTATCATA ACATGAATGG AGAATATAAA TGGCTAACTA 8100

TCCACAGTTA AACAAAGAAG TACAACAAGG TGAAATCAAA GTGGTTATGC ACACAAATAA 8160

AGGTGACATG ACATTCAAAT TATTTCCAAA TATTGCACCA AAAACAGTTG AAAATTTTGT 8220

GACACATGCA AAAAATGGTT ATTATGATGG AATCACATTC CACCGTGTCA TTAATGACTT 8280

CATGATTCAA GGTGGCGATC CAACAGCTAC TGGTATGGGT GGCGAAAGTA TTTATGGCGG 8340

TGCTTTTGAA GATGAATTTT CATTAAATGC ATTTAACTTA TATGGCGCAT TATCAATCGC 8400

TAACTCAGGA CCTAATACTA ATGGTTCACA ATTTTTCATT GTTCAAATGA AAGAAGTACC 8460

TCAAAATATG TTAAGTCAAC TTGCAGATGG TGGCTGGCCT CAACCAATCG TTGATGCATA 8520

TGGCSAAAAG GGTGGTACAC CATGGTTAGA TCAAAAACAT ACAGTATTCG GTCAAATCAT 8580

TGATGGTGAA aCTACATTAG AAGATATTGC AAATACAAAA GTGGGACCAC AAGATAAACC 8640

ACTTCATGAT GTTGTAATTG AATCTATTGA TGTTGAAGAA TAATATCTAA ACATAATTAA 8700

CTACCAACAT TTTAAACTCG GATAAAGCTA ATTTATGAAT GGATTAGTAT ATATTCCAAC 8760

gAAAATAAAT AAACTAATAT GATGAGCAAT CTCAATATAT TTATCaAGAA AGCACAGTTT 8820

TTAAATAGAT GTGTATTTTA AAGATAATAG TTGAGGTTGC TTTTTATGTT TTTACAGAGA 8880

ATTGCTATTC AAATAGTAAA TAAATTGAAA ACAAAGTAGC TGGATATCAT ATTGATTTAG 8940


SO 


ATAGGAATTT 


GTTGCTAATT 


TTATTTGTAA ATCCAAGTTT 


GTAGAATTCT 


TATTCATTTA 


9000 




TAAAATAATA 


TTCGTATGAT 


TTGATTTTTT 


AATTAGTCCA 


CCATTTCGAT 


TTGTGCTATG 


9060 
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AACATATCAA GGTGCGTGTA cTGGTATTCA ACCATACGGT GCGTTTGTTG AGACCCCTAA 9180 

TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 9240

GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300

AAA3CTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 9420

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 9540

AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT ACGATAACGT 9600

AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 9660

TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 9720

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 9780

GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 9840

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAGTA GAGTTGAGAA 9900

ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 9960

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020

CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260 

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320 

GTGCftATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT TACTTAATTC 10380 

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10500 

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560 

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 10620 

TGATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 10680 

TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740 

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800 

AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860 
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AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 10953 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8155 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 60 

GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAATG TTAAACAGAA AGGTAGTTTA 120 

GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180 

AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 240

TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 300 

CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 360 

CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAGGTGT CGCAAACCAA GAACTAACAC 420

CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 480

AAAAACACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 540

CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 600

CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 660 

CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 720 

AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 780 

TACCaAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGGCACAAA 840 

AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 900 

TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 960 

CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 1020 

CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 1080

CATTTGACGG CGATGGAGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 1140

GTGACCAAAT TATGTTTATT ATTGGTCAAG AAATGCATAA AAATCAAGAA TTGAATAATG 1200

ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 1260

GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGCGTCGCG 1320 
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CTGGTGATGG TTTATTAACT GGTATTCAAT TAGCTTCTGT AATAAAAATG ACTGGTAAAT 1440 

CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 1500 

TAACAGATAA ATATCGTGTT GAAGAAAATG TTGACGTTAA AGAAGTTATG ACTAAAGTAG 1560

AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAGTAAGACC TTCTGGAACA aACCATTAGT 1620

TCGTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 1680

TGATGTGGTT CAAGATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 1740

TGCGTATGcA nTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 1800

ATAAACAAAA TAAAAACAAA GTAATCAATA TGTAATATAA AATACACTGG TACTCAATAT 1860

ATAATGATGA TAAAATTAAT TTTAATTAGA TAGAGTTGCT TTGTGTTTTT AACGCAGATG 1920

CTACTACTTA TCTTAACAGT TGATTAAGTG AAATCATTTA ACAGCGAGAA TAATCAACCA 1980

GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 2040

CGGTATTTTT TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 2100

GTCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 2160

TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 2220

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 2280

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340

TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 2400

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460 

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2520 

AAATGCTAGC AATACTGGTG AAGGCAGTAT TAATACGACA TTAACATTTG ATGATCCTGC 2580 

CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT GTTACAGATA AAGTAAATGG 2640 

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2700 

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 2760 

TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2820 

AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 2880

GCAAGGTGCA ACT^TTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2940

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3000

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 3120 
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TAATACTGAA ATCGGTAACA ATC3GTAATTT TGGTGCTTCA TTAAAAGCAG ATCAATTTAA 324 0 

ATATGAAGTA ACATTACCAC AAGGTGTAAC TTACGTTAAT AATTCATTAA CTACAACATT 3300 

CCCTAATGGT AATGAAGACA GTACAGTATT GAAAAATATG ACTGTTAATT ATGATCAAAA 3360

TGCAAATAAA GTTACATTTA CAAGCCAAGG TGTGACAACG GCACGTGGTA CACACACTAA 3420

AGAAGTTTTA TTCCCAGATA AATCTTTAAA ATTATCATAT AAAGTTAATG TTGCGAATAT 3480

CGATACACCT AAAAATATTG ATTTTAATGA AAAATTAACA TATCGTACTG CTTCAGATGT 3540

TGTAATTAAT AATGCGCAAC CAGAAGTaCA CTAACTGCAG ATCCATTTTC AGTAGCGGTT 3600 

GAAATGAACA AAGATGCGTT GCAACAACAA GTAAACTCAC AAGTTGATAA TAGTCATTAC 3660 

ACAACAGCAT CAATTGCAGA ATACAATAAA CTTAAACAAC AAGCAGATAC TATTTTAAAT 3720 

GAAGATGCGA ATCATGTTAA AACTGCAAAT CGTGCATCTC AAGCGGATAT TGATGGTTTA 3780 

GTAACTAAAT TACAAGCTGC ATTAATTGAT AATCAAGCAG CAATTGCTGA ATTAGATACT 3840

AAAGCTCAAG AAAAGGTTAC AGCAGCACAA CAAAGTAAAA AAGTTACGCA AGATGAAGTT 3900 

GCAGCACTTG TAACTAAAAT TAACAATGAT AAAAATAATG CAATCGCAGA AATTAATAAA 3960 

CAAACTACAG CACAAGGTGT CACAACTGAA AAAGATAATG GTATCGCAGT GTTAGAACAA 4020

GATGTGATTA CACCAACAGT TAAACCTCAA GCGAAACAAG ATATTATCCA AGCAGTTACA 4080 

ACTCGTAAAC AACAAATTAA AAAGTCAAAT GCATCATTAC AAGATGAAAA AGATGTAGCA 4140

AATGATAAAA TTGGTAAAAT TGAAACAAAG GCAATTAAAG ATATTGATGC AGCAACAACA 4200

AATGCACAAG TAGAAGCCAT TAAAACAAAA GCAATCAATG ATATTAATCA AACTACACCT 4260

GCTACAACAG CTAAAGCAGC AGCTCTTGAA GAATTTGACG AAGTTGTTCA AGCACAAATT 4320

GATCTW^GCAC CTTTAAATCC TGATACAACA AATGAAGAAG TAGCGGAAgC TATTGAACGT 4380

ATTi^TGCAG CTAAAGTTTC TGGTGTTAAA GCAATTGAAG CGACAACGAC TGCACAAGAT 4440

TTAGAAAGAG TTAAAAACX3A AGAAATCTCA AAAATTGAAA ATATTACTGA CTCTACGCAA 4500

ACAAAAATGG ATGCCTATAA TGAAGTTAAA CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4560 

GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620 

GCTCAAAAGC AAGGTTTACA TGACATCCAA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680 

ACAAAATCAA AAGTATTAGA TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740 

GCAGCTGATA CGGAAGTAGA AAACGCATAT AATACACGTA AACAAGAAAT TCAAAATAGC 4800 

AATGCTTCAA CTACAGAAGA AAAACAAGCT GCATATACAG AATTAGATAC TAAAAAGCAA 4860

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920 
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GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 504 0 

ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100 

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 5220

GCAGATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 5340

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 5460

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5520

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580 

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAGTAT TGATCAAGTA 5640 

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCTVAT TTTAAATAAC 5700 

AAATTGCAAG AGATTCAAGc tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAC 5820

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGC7«3CGA TTAATGCGGT AACACCAAAA 5880

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5940

GTTATCAATA ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6060

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120

AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180 

AATASAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6240 

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 6300

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6360

GCGAAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 6480

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6540

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6600

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6660

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6720
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AAAGTTCAAC AACTTCATGC AAATCCTGTT AAGAAACCAG 


CAGGTAAAAA 


AGAATTAGAT 


6840 




CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900

ATTAATGATG CAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960

CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020

GCAGTTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080

AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGCTGAA 7140

GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200

TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260

ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320

AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680

AAAGCGAAAg cTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740

CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920

AATAGTGCGC TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGXAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA tAATTAAAAA TGCAGATGCA 8040

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGJUiG TTGCCTGAAT TACCA 8155




(2) INFORMATION FOR SEQ ID NO: 64: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT 120 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 240 

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300 

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TCATCTTTAT 360 

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 480

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 540

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 660 

AAACTTAAAG AAACGCATAA AAATACGCAA GACAAAGTCT TGCGTATCX5A TAGAGTCCGT 720 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT ATACAGGTGG GTGCCCTGTT 780 

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 840 

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900 

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 960 

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 1020

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080 

ATTGTATCTC TCTiAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 1140

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 1200 

TCCTgTGCTT GTTCTTTAGT AAATCCAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260

TCATSAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 1320 

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 1380

GTAAATGGAT GATGTGCTGC AACGTAAOST TTCGCATCTT CATCATATTC TAATAATGGC 1440

CT^TCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 1500 

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560 

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 1620 

CAAAGAAACG 1630

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 60 

CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 120

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180 

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 240

CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 300

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 360

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 420 

CATGTGCATA TTCATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 480 

CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 540

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 600 

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 660

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 720

TACCACAACC CG 732 
(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5838 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 60 

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 120 

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180 

TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 240

AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 300 

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 360 

ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 420 
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CAACTTTATA 


CATTAAAATA 


ATATCATAAT AAGGATAAAA 


AATAATAGAT 


ATTGATTTTA 


540 




GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600

GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660

ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720

TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780

TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840

ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCGAAAT 900

TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960

GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATCGATAAA TTAAACTCTG 1020

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200

ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260

CTGGTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 1320

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 1440

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500

TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560

TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620

TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 1680

AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 1740

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860

TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCCTTG 1920

TGAAATCACT CTGTACGCTT CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 1980

AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2040

TACGAGACCA AAACCTTTCT CTTTAACCTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2100

GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160

TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GCCATATTAT AATATGGGTC 2220
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AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 234 0 

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 2400

TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 2460

TATTGCTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2520

AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAGTTGA 2580 

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 2640 

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 2700 

CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 2760

CGGCTATAAA AAATGGACTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 2820 

TACX3ATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 2880 

TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 294 0 

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 3000 

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 3060

TTTCTTGTTT AAATAGAAAT TGTCmTTC AATTGATTTT GAAACCATTA TCCTTAAATC 3120

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3240

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 3300 

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 3360 

TCAATCATCA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 3420 

AAATCTAATA ATCX3CTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 3480 

CCACJTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 3540

AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 3600 

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3660 

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3720 

TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 3780

CCTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 3840

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3900

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3960 

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4020
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GATTTAAATC 


CTGCAAATGa AGCTGAGGCT 


GGaTTCGTAC 


CATGCGCAGA 


ATCTGGCACA 


4140 




ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 4200

GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4260

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4320

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 4380

TTAC3GGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4500

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4740

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4800

TGGTTTGTCA AATTTGACTA CAAACTCATT GnmAAGnTGT ACCATCTAAT ACTTCAAAAC 4860

CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4920

CATAGATACC TTGTTTACCA AGTGCTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4980

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5040

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 5160

ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 5220

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 5280

TATcSrCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5340

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 5460

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5520

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5640

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 5700

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5760

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AATGGGGATT TAAACnTCTA 5820
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(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

ATnATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60 

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTeiAAGCTTT 120 

GTACGAAATT GTACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180 

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240

TTAACAAAAG 6ACGTTTCCA CAAGTGTGGG TTATTCTCAC TAAAGCAATA CGCAGAGACA 300 

ACTTACGTAA AATTTTGAAC TGACTAGAAC GGAACTTCTA CTCAATTATT GATAAAAATT 360 

TTCAAAAAGA CTTGAATGTG CTGAGAATAC GAAGTTTATG GAAGGATTAT CAAAATATAA 420 

ATGTGCATTC ATTTACAACC TTTATTGACA ATGATTCTCA ACTAATATAG TATATAATCA 480

AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600 

AGAAGCAACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660 

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720 

TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780 

GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840 

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900 

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960 

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260 

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320 

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380 
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TTCAACATTT GGTTGAAGCG TTTGTGCGTG AgcAACAATG GAGTCACAAA TATAAAACAG 15 00 

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG 1560 

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgcATCA ATTGGGGTTT 1620

TTAATAATTA TCAACGTAAT GACX3CATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA 1680

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC 1740

AGGTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC 1800 

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC 1860 

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT 1920 

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACXJTTGAA 1980

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG 2040

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC 2100 

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAGCTAG CAATGACAAA 2160 

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA 2220

GGTGCTTCAG TTATCGGTAT TGATATTGAT CCy^CAAGCCG TTGACCTAGG GCGCAGAATC 2280

GTTAACGTCT TAGCACCAAA TGAAGATATA ACAATTACGG ATCAAAAGGT ATCTGAACTT 2340

AAAGATATCA AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC 2400

ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT 2460

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA 2520

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA 2580

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC 2640

AATGCAATTA GCGAATATTT GCTATTTAAA ATCAGATTAT GAGATTGATA TGGTTGGACG 2700 

TGCCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT 2760

TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG 2820 

TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC 2880

AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA 2940

ACATGTCATT TTAATATCAC CGACATTTGG TTCGCAAATG ATTGTCGAAC AATTTATGTC 3000

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTCTCAACT TATCTTGGCG ATACACGTAT 3060

TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT 3120

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG CTGAGCAATT 3180
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TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 330 0 

AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420 

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660 

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720 

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAAG 3780

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840 

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3 900 

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3 96 0 

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 402 0 

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4 08 0 

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 414 0 

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4 2 00 

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 42 60 

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320 

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4 3 80 

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 444 0 

TCGACATTAA TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4 500 

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4 560 

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4 62 0 

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4 68 0 

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCC TGCTGGTGAA 4 74 0 

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4 800 

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4 860 

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4 920 

GCTGTGAGTG AC7VAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980

ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100

TTAGATGAAG CTGGTTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA ACAAGCAGAA 5220

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 52 8 0 

GATAAAATTG CTGAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 534 0 

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 54 00 

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 546 0 

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520 

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580 

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 564 0 

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700 

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760 

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820 

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

GAAGCGATGC AATTTAATTT TGGTACAAGC TACATTACAG GTGACCCAGT TGCTGAACGT 594 0 

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6 0 00 

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATATTGGAT AGCTTCAATA 6120 

CTTATTATTT ACGTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180 

GAAAGTTACA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240 

AATtMTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300 

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTTGCG 6360 

GTATCAATCT TTTGTATGTC TATACCAATG ATAATGGGTG GACTAGTTGT TATCGAGTAT 6420 

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 64 80 

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA TTAATGCGCT ATTAAATCCA AGATTAAGGG aGGGCGCACG ATGATAATTT 6600 

TAAAmCGATT ATTmCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6 660 

TCTTTTTAGG ATTAGCAGCA CCACTTGTGA CATTTTATGA TCCTAACCAT ATCGATACAG 6720

CAAACAAATT TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780

TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 6900

TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 7020

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7 08 0 

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 714 0 

AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 72 0 0 

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260 

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 732 0 

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380 

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 7440 

AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500 

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560 

CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620 

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 774 0 

TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800 

ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 79 2 0 

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA 7980 

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040

ThTUiMiPAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100 

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG 8160 

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220 

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280 

ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA 8340 

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400 

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460 

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580

TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTCAAG 8880

CTGCCACGTA TTTATGTGAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060

ATATATATTA ACATTGATGT TCTTTAGTGC CAATGCAATC TTAAACGTGT TTATACCTTT 9120

ACGAGGGCAT GATTTAGGCG CAACGAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCCCAT 9240

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTACGG CATTCTTTTC 9360

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACGATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540

AACAACATTC TTTGsTATCG CGTGACCTTT GCTGAACAGG AACCCGATAC GTCAGATAAG 9600

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGCGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTCAGATGGT 9840

ATGTEGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900

GTGGCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020

GTAGGTCGTA ATATGTTGTT AGGTTTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080

GGTGGCGCAT TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380

TTACATGAAA ATATGCAAAA CGAGTATAAC TGCTAATTGA TAGAAATAGC TCACCATAAA 10500

ATTACGGTAT GATTTTAAAT ATAAGTAAGT CGCACTACCT GCTAGTATCA ATGCTGGAAT 10560 

GAATTCCCAC CATGTATTAA TGTATGGATA GTAGAACAGA GTTTCAAGGA TAATGGACAA 10620 

TACTATTGTA ATCTTTAAAG GTATTAATCT GCTTAATTCT TGAATTAAAA TATGACGGAA 106 8 0 

AATAAGTTGA CAAATCAAAG TATTTAATAT AATGGTTAAC GAAAATATAG CTATTAAACT 10740 

GATGGAaCCA TACCCTTTAA TGAGCGGGTA AATGTCAAAG ACAGTAAAGG AATCTACATT 10800 

TAGTGCGAAA ATATTGAAAT GATTTAAAAG TAAAAAGAGT ACGACACTTA GTGTAAATGA 1086 0 

TATAAGAATA TGCCATTTAT ATTTAGCACT AGCAACGATT TGCGAACGTA TCATTGGAAT 10920 

AAACGCATCT TCATGCATCA GACGAAAAAT AGCTAGTGAA ATAATAACTG CGAGTAAATA 10980 

GCTAATGTTC ATTGAAATAG GAAAAGAGAA ACCCCACGGA GCTTGTTGAG TGAATACAGC 11040 

TACTAACCCA AAAGTTAAAA AGACGATAAT GATCGGCAAG ATGTTAACCA AAAATATGTA 11100 

AAGGAAAATA AATCCAATAT CACGTTTGAA AAAACGCGAT TGTTCGGTAG CGTATTCTTC 11160

TTCTATGTAA TGTTTATTTG TATTTGACAT AGTATACCTC TTAAATAGTT GTATTATATA 11220 

GATACTTTAG CACATATTAC TTTGTATTGT ATGTTTTATA CATTAAAATT TAAAATGAAA 1128 0 

AACATATCAT AAAATTGTTT TATAAAATGA AGCGCTTCCA TTGTGTTTTG TTTTGTAAGG 11340 

TGTATCATAA ATATTGAATT GAAATTTTGG GGGGAGGTAT TGTAATGACX; TTTCTTACAG 11400

TCATGCAATT TATAGTTAAC ATTATCGTTG TAGGATTCAT GCTTACGGTT ATTGTTATCG 11460 

GGCTTATTTG GTTAATTAAA GATAAAAGAC AATCACAACA TAGTGTATTA AGGAATTATC 11520 

CTTTACTAGC ACGTATTAGA TATATTTCAG AAAAAATGGG ACCGGAATTA CGTCAGTATT 11580 

TATTTTCTGG GGATAATGAA GGGAAACCTT TTTCACGTAA TGATTATAAA AATATCGTTT 11640 

TGGCfiGGAAA ATATAACTCT CGTATGACCA GCTTCGGTAC TACTAAAGAT TATCAAGACG 11700 

GCTTTTACAT ACAGAACACA ATGTTTCCGA TGCAACGTAA TGAGATTTCA GTAGATAATA 11760 

CAACATTGTT ATCAACATTC ATTTATAAAA TCGCGAATGA GCGTTTATTT AGTCGTGAAG 1182 0 

AATATCGTGT GCCGACAAAG ATTGATCCGT ATTACTTAAG TGATGACCAT GCAATAAAAT 11880 

TAGGTGAACA TTTAAAACAT CCATTTATTT TAAAACGTAT CGTAGGACAA TCTGGTATGA 1194 0 

GTTATGGCGC TTTAGGAAAA AATGCCATTA CAGCTTTATC TAAAGGTCTA GCTAAAGCGG 12000 

GCACTTGGAT GAATACAGGT GAAGGTGGCT TATCAGAATA TCATTTAAAA GGTAATGGGG 12060 

ATATCATTTT CCAAATTGGT CCCGGTTTAT TTGGTGTTCG TGATAAAGAA GGTAATTTTA 12120 

GTGAAGGTTT ATTTAAAGAG GTTGCACAGT TATCTAACGT ACGCGCATTT GAGCTGAAGT 12180

TTGCTAAAAT CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300

TTATTCATAA TGCTGAAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360

AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420

CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480

CTGGTGCAAC ATTCCAAGAA TTACAAGATX3 GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540

CTATTGTGTC TGGCATGTTA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600

CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660

TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720

TGAATACGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780

TTGGAGAAAA GCAATATCGT GTCACAAACT ATGTAACAAG TTTGCATGAA GGCTTATTCA 12840

ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900

ATCGAAAAGT CXSATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960

AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CGTTACTGCA AATAATTTTA 13020

TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAAGCCCTC AGAATATGAA 13080

TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCATTTTTA AGTTATAAAC TATTTGTCGT 13140

CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTTTCAAAAC 13200

TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260

CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320

ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTTTGGCTT 13380

CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAC TGTTTTAAAA 13440

CTTfATCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCWTCAT 13500

ATGTTTTGGT TTGTTCTTTA CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560

CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620

GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTT 13680

TTGAAGATAT GAGTTTTTTT AAGCGGATTC CTCACAAAAT TTTAAAAATA TTTAAGCCTk 13740

AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920

GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980
AACAAAGATT GATAAACATG GTTTTATTTC ATTTACGCCA kTgGTGGATG GTGGAATCAA 14100

GTCATGCTAT CTCAAAAAGT AACGATTACA ACAGATTCGG GCAAAGAAAT TAGAGGTATC 14160

ATCGGTTCTA AACCGCCACA TGTCTTAACG CCTGAAGAAC GTAAAAAGCC AATGGAAATC 14220

AAAAATATGT TTATAGATAT TGGTGTTAGT AGCAAGGAAG AAGCTGAAGA AGCTGGCGTT 14280

GAAGTAGGCA ATATGGTTAC GCCATATAGT GAATTTGAAG TGCTTGCAAA TGATAAATAT 14340

TTAACTGCGA ArCATTTGAT AATCGCTATG GCTGTGCATT AGCTATTGAG GTATTAAAAC 14400

GTTTAAAAGA TGAAAATATT GGCATTAACT TATACAGTGG TGCCACAGTG CAAGAAGAAG 14460

TTGGTTTGCG TGGTGCGAAA GTGGCAGCGA ATACGATTAA ACCAGACTTG GCGATAgcTG 14520

TcGATGTAGG TATTGCTTAT GATACCCCAG GTATGTCAGG TCAAACGAGC GATAGTAAAC 14580

TAGGCGGTGG TCCAGTTGTC ATTATGATGG ATGCTACAAG TATTGCTCAC CAAGGTTTGC 14640

GAAAgcATaT TAAAGATGTA GCTAAGGAAC ATAACATCGA AGTACAATGG GATACGACAC 14700

CAGGTGGAGG TACAGATGCG GGAAGTATTC ATGTCGCAAA TGAAGGTATT CCAACGATGA 14760

CAATCGGTGT TACGCTGCGA TACATGCATT CTAATGTTTC AGTGCTCAAT GTAGATGATT 14820

ATGAAAATTC TATCCGTCTT GTTACTGAAA TTGTCCGTTC ATTGAATGAT GAAAGTTATA 14880

AAAATATCAT GTGGTAATCA AATCCATAAA TAATAAAGAA TCCTTTTAAT ATGGTAGGTT 14940

GTTAAACAAT TGTCTAATTT TAATTCTTAG TCATTAGACA GTATCCATGT TAATAGGATT 15000

TTTTGTTTTT AATTTAAATG CTGAAAATCA ATTATGCCTA AATTTTGATA TTACAAGAAA 15060

ATGATTTTTT CTTAAATGTA ATTGCACTAA AAACCAAAAA AACGGGAATA ATATACCTGA 15120

TATATTACAT GAGGAGCGGT GCAAATGTTG TTAGAAATTA AAGATTTAGT GTATAAAGCG 15180

AGCGATAGAA TCATACTAGA TCATATCAGT CTAAAAGTAG ATAAAGGCGA GAGTATTGCC 15240

ATTATAGGTC CATCAGGTAG TGGTAAAAGT ACATTTCAAA AGCAAATATG TAATTTGTTT 15300

AGTCCAACTA GTGGAGAACT TTATTTTAAA GGTAAACCCT ATAATGATTA TGACCCGGAA 15360

GAATTGCGTC AACGAATCAG TTATTTGATG CAGCAAAGTG ACTTGTTTGG TGAAACGATT 15420

GAAGATAACA TGATATTCCC ATCACTTGCA CGTAATGATA AATTTGATAG AAAACGTGCA 15480

AAGCAATTAA TTAAAGATGT CGGTTTGGGA CATTATCAAT TAAGTTCGGA AGTGGAAAAT 15540

ATGTCGGGTG GTGAGCGGCA AAGAATTGCT ATAGCGCGCC AACTGATGTA TACACCGGAT 15600

ATTCTTTl'AT TAGATGAATC GACCAGTGCA TTAGACGTTA ATAATAAAGA AAAGATAGAA 15660

AATATCATTT TTAAATTAGC AGATCAAGGC GTGGCAATTA TGTGGATTAC CCACAGCGAT 15720

GACCAAAGTA TGCGACACTT TCAAAAGCGT ATAACAATTG TTGATGGTCA AATTTCTAAT 15780
CATTCCGATT ATCATTTCAT ATAAAGAAGG TTTACATATT ATTAAAGATT TAATTGTTGC 15900

GACATTACGA GCAGTTGTGC AATTAATCAT TTTGGGATTT TTGCTGCATT ATATTTTTAA 15960

AATAAACGAT AAATGGCTGC TTATTTTATG TGTATTGGTC ATTATTATTA ATGCATCATG 16020

GAATACAATT AGTCGAGCAT CACCAGTGAT GCATCATGTG TTTTGGATAT CATTTCTAGC 16080

TATCTTCATT GGAACGGCAT TACCGCTTGC AGGTACTATT GCGACAGGGG CCATTCAATT 16140

TACCGCAAAT GAAGTTATAC CTATCGGCGG CATGCTTGCA AATAATGGCT TGATTGCAAT 16200

TAATTTAGCT TACCAGAATT TAGATCGTGC ATTCGTACAA GATGGTACTA ATATTGAATC 16260

TAAATTATCA CTTGCAGCTA CACCTAAATT GGCTTCTAAA GGTGCAATAC GTGAAAGTAT 16320

TCGTTTAGCT ATAGTGCCAA CTATTGATTC GGTTAAAACA TATGGGCTTG TGTCGATTCC 16380

TGGTATGATG ACAGGCTTAA TTATTGGTGG CGTACCACCT TTACAAGCGA TTAAATTTCA 16440

ATTGTTAGTC GTGTTTATTC ATACAACTGC GACCATTATG TCTGCTTTGA TTGCGACATA 16500

TTTAAGCTAT GGTCAATTTT TCAATGCAAG ACATCAATTA GTAGCACGAA ATACTGATGT 16560

TAAGAGTGAA TCATGATAGA TTTTACTGCA TCAGATTTAG GCATTAGTTT TAATTGGAAA 16620

TGAAGTGACG CGCACATATA GTATCGCTAT TCATTAGCGC AGCGAAAATA TTCATAAAGG 16680

CACGCATACT TTGTAGTCAG TTATCTGTTC TGACATATAA AGCGTGCGTG CTTTTTTGGA 16740

GTTATTGTTG AAACTGAAGT AATTATACAT AATTATTAAA TGACATACTT GTGTTAATTT 16800

TTCAAATACT GAAAAACAAT TTCaATAATT TTCCaATTAA GCACAGAAAA TTAAAGCAAA 16860

ATATTATATA ATAGAACGGT TATATATaAA nATTngTgCA CACATTTTTT AATAAATCGT 16920

TATTCTAAGG GAAATGAATA TCGGAAATTT TGTTTGAAAG GAGTTTTAAA TTGTCAATCA 16980

TGCGACTATT TACATTCATT TTAAGTATTT TTATCGTAGG AATGGTTGAA ATGATGGTTG 17040

CAGGAATTAT GAACTTGATG AGTCAGGACT TACATGTATC AGAAGCTGTC GTTGGTCAAT 17100

TAGTGACAAT GTACGCTTTA ACATTTGCGA TATGTGGACC TATTCTGGTT AAATTAACGA 17160

ACCGTTTTTC ATCAAGGCCT GTATTATTAT GGACATTACT TATATTTATC ATTGGTAATG 17220

GCATTATTGC TGTAGCGCCA AATTTTTCaA TATTAGTAGT TGGTAGAATT ATCTCATCTG 17280

CAGCAGCAGC ACTAATTATC GTAAAAGTAT TAGCTATTAC AGCGATGTTA TCAGCACCTA 17340

AAAATCGTGG TAAAATGATT GGACTTGTCT ATACAGGGTT TAGTGGTGCT AATGTTTTTG 17400

GTGTACCAAT TGGAACGGTT ATCGGCGATT TAGTAGGTTG GCGCTATACA TTTCTATTCT 17460

TAATTATTGT GAGTATTATT GTTGGCTTCT TGATGATGAT CTATTTACCG AAGGATCAGG 17520

AAATACAACG AGGCCCTGTG AATCATGAGA CACCATCTCA TGAAAATCAT GTTACTTCGA 17580

CAAACTCAGT GACATTCGTC TTTATAAATC CACTTATTTT ATCTAATGGT CATGATATGT 17700

CATTCGTTTC ATTAGCACTT CTAGTAAATG GAATCGCTGG CGTTATTGGA ACATCATTAG 1776 0 

GTGGTATATT CTCCGATAAA ATTACAAGTA AGCGTTGGTT AATGATTTCT GTTTCTATTT 17820 

TTATCGTCAT GATGTTACTT ATGAATTTAA TCTTACCTGG TTCAGGTCTA TTGTTAGCAG 1788 0 

GACTATTTAT TTGGAATATC ATGCAATGGA GTACTAATCC AGCAGTGCAA AGCX3GTGTGA 17940 

TTCAACATGT TGAAGGCGAC ACAAGCCAAG TAATGAGTTG GAACATGTCT AGTTTAAACG 18000 

CTGGTATTGG TGTTGGAGGC ATTATTGGAG GCTTGGTCAT GACACATGTT TCTGTTCAAG 18060 

CTATCACATA TACGAGTGCC ATCATTGGCG CATTAGGATT AATCGTTGTT TTCACATTGA 18120 

AAAATAATCA TTATGCTAAA ACATTTAAAT CATCATAATT CTCATATGAm AAGCACGCCT 18180 

GCTATCAAAT TCAGGTGTGC TTTTTTAGAT GCGATAACGT TATTGATATG TGCGATAATA 18240 

GCGACGTTCA TTATGATACA TCGGCCT^GG CATTTTACCG CTTTTAGCAA AATTAGCTAA 18300

ATCATTTTGC ATTTGTCGAC TTAAAAATTT AAGGTGaGCA GTTGTTGGaT ATgAT 18355 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1192 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 



CGCAAAGAAG TACAAAAAAT GTTTTTACAA GAAGGTATTA AAACACCTCA ACCAATTATG 60

ACTGCTTATA ATCATAGTGA AAACGgTGTT TAGTAGTTTA TAATACATGG AGGTCATATT 120

TAATGGCGTC AAAATATGGA ATAAATGATA TAGTAGAAAT GAAAAAACAA CATGCGTGTG 180

GAACAAACCG TTTTAAGATT ATTAGAATGG GTGCAGACAT AAGAATTAAA TGTGAAAATT 240

GTCAAAGAAG TATTATGATT CCACGTCAAA CGTTTGATAA AAAACTTAAA AAAATCATCG 300

AATCTCATGA TGATACACAA AGATAGGAGA ATGATTAATG GCTTTAACAG CAGGTATCGT 360

TGGATTGCCA AACGTTGGTA AATCAACATT ATTTAATGCA ATAACAAAAG CAGGTGCTTT 420

AGCAGCGAAC TATCCATTCG CTACGATTGA TCCTAATGTA GGGATAGTAG AAGTGCCAGA 480

TGCTAGATTA CTTAAATTAG AAGAAATGGT TCAACCTAAA AAGACATTGC CGACTACATT 540

TGAATTTACA GATATCGCTG GTATTGTGAA AGGTGCTTCA AAGGGAGAAG GGTTAGGTAA 600

TAAATTCTTA TCACATATTA GAGAAGTAGA TGCGATTTGT CAGGTCGTTC GTGCATTTGA 660

TAATATGGAA TTAGTACTAG CGGACTTAGA ATCTGTTGAG AAACGTTTGC CTAGAATTGA 780

AAAATTAGCA CGTCAAAAAG ATAAGACTGC TGAAATGGAA GTACGTATTT TAACAACTAT 84 0 

TAAAGAAGCT TTAGAAAATG GTAAACCCGC TCGTAGTATT GACTTTAATG AAGAAGATCA 90 0 

AAAATGGGTG AATCAAGCGC AATTACTGAC TTCTAAAAAA ATGCTTTATA TCGCTAATGT 960 

TGGTGAAGAT GAAATTGGTG ATGATGATAA TGATAAAGTA AAAGCGATTC GTGAATATGC 1020 

AGCGCAAGAA GACTCTGAAG TGATTGTTAT TAGTGCAAAA ATTGAAGAAG AAATTGCTAC 1080 

ATTAGATGAT GAAGATAAAG AAATGTTCTT AGAAGaTTTA GGTATCGaAG AACCAGGATT 1140 

AGATCgrTTA ATTAGGAmCA CtTATGAATT ATTAGGnTTA TCCACCATAA TT 1192 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7494 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



AATATAGCTG CAATAGCATC TCGTTTCATT TGTATAATCA ATTCCGGTTT AAATATCAGT 60

GTGAACGTAA GCACGACACA GATTAAAAAT AACACTGCCG GAATGAGTCG TTTCAATCGT 120

CGCTtCCAAA ACTCTAGCAA ATCGATTTTT TGCGTCCGAT AATACTCACT TATCAACAAA 180

CTTGTTATTA AATAACCTGA AATAACGAAG AATGTATCTA CTCCTAAAAA GCCCCCACTT 240

AACCATTGTG CATTCAAGTG ATAAATAATG ATTCCTATAA CTGCGAATGC CCTCAATCCA 300

TCTAATCCAG GTAAGTATCG CGGGGAATAC ATTTTTTCTA AACGTTTAAA GTCTTTTGTA 360

TCCAfGTTAA TAAACGCCCC ATTTATTTTT CTCTATTTTG TAGTATATCA CAATATTTTT 420

GAAAATAAAA TATTGCACTG aTTTTCATTA ATTGATTTAA CCCTTAATTA AGATAGTTTT 480

AAATTTTTTA TTAAGTAGAA AACAATTATT ACAGTTGATT TCATTACTGC AAACCACATA 540

TAAATTTGTC GATTTTACTA CATAACATAG ATTATCATAG ATTCTTGAAT TTTTAGCAAA 600

ATAACTGTTA TTTTCATTAT ATTTTTACAA AAAAAGGTTC GTTTTATATT TTATGCATCT 660

TACTGTAACA GAATCATTAA GATATGCTAT TCGAATATAC TTTTTCAAAA TTTATATAAT 720

GAATAAATTA ACATGTATTG AAAAAAAAGC GAAATGCAGC CTATCCTCTA ATGTAAACCA 780

AACGATATAT CTCGTCAGAC TTTATATTTA AACGCTATGT GTCACTTTTA AAATGAATAT 840

TACTAAGATT GTCATATCAA TTATTATTGC ATCGAATTAA TCTTTTAAAT TTCTGTAATA 900

ACGGAAGTCA TTATTAGAAT AAAAATACTG TGCACTAATA AATTTATCAA TTGTTCCTAA 1020

ATAAATACCA TCGATATTTT GTTCTTTACA TGTCATTATA ACTTTATCTA AAAGTTTTTT 10 80 

ACCTATTTTT AAATTCCTAT AACCTTTATC AACAAACATT TTTTTAAGTG CAGACATATT 1140

ATTATCTAGT CTAATCAAAC CTATAGTACC AACAATATTT TGaTGATTGT TTATTGCAAG 1200 

CCAAAATgCC CTCCATTATT CAAATAGTTA TGTTCGATGT TCTCCAAATC AGGTTGATCA 1260 

TCTCTATCAA TTTTTATATa AATTCATTTT TTTGAATCGA TAAAATAAAC TCGATTAGCT 1320 

CTTCCTTATA AGACCTATTA TATTCAATTA TGTTTATAGC CATTTTTATC TCCTTTTTCA 13 80 

TTTAATTTAA TTATAAAATG TGCGTTTAGT TTGTATCTAG TGTACTCAGT ACAGCCTCAA 144 0 

ATGAAGTTTC ATTCCACTTG GCACTTAATA AAGACAAGTA TTTTAGCAGT AATACAATAA 150 0 

AGTCCAATAA ATTTCCCTAA CTTCAATATC CACTTTTTAA AAAATGTATT TTTAATTAAT 1560 

AAAAAAACTC TCCCCAATTT CTATGGGAAG AGCTATATAT TTAATGTCTA AACATTACTT 1620

TTATTTATTA TGAAGGAATT AGAATCCCCA AGCACCTAAA CCTTGTGCTT TGTATGCTTT 168 0 

AACAGCTGCG TTGATTTGTT GGTCAACAGT GTTTGTTGGA CCCCAACCTG GCATAGTTTG 174 0 

GAATAAACCT GAAGCACCTG ATGGGTTGTA AGCATTTACT TGACCATTTG ATTCACGAGC 1800

GATGATTGCA GCCCATGTAG AAGCTGAAAC ACCAGTACGT TGAGCCATGA TTTGAGCTGC 1860 

TGATGAACCA GTAGCACCTG CAGTATTACC ATTGCTTAAT CTCACTGAAC TTGAAGTAGT 1920 

TGAAGTGCTG TAGTTATGGT AAGTTGGAGC TGAAACAGCT TCAACGTtTG AGTTACTTGA 1980

TTGTGCATTG TAGCTTACTG ATTGTACATT TGAACCTTGG TTGTATGAAG TAGTGTAGTC 204 0 

TGCACCTGCA ACGTTTGAGA AACCAGCAGT TTGACCATTA GCTGCTTCAT AGCTCCATGA 2100 

CCATGTAGTA CCATTTGAAG TGAAGTTATA TTGGAAACCA TCTTTTACAA AGTGGATGTC 2160 

ATATGCACCA TCTTTGATTG GAGCTGCATT TAATTGATCT TGGTGATTAT GCGCTAAGTC 22 20 

AACTAAGTGT GCTTGATCAA CGTTTACTTC AGCAGCGTGT GCTTGATGTC CTGTACCTGC 22 80 

TGCGTAACCT GTTACACCTA ATGCCACTGC TAATGATGAT GCCATAATTG TCTTTTTCAT 2340 

AGTAAAAAAT CCTCCAGTAA TAATTGTnAG TTTATGTTTT TAGTAATTAT AtTTTGjiATT 24 00 

TGAATGTCGT AGTgCAAGTT TAAATTGTCT TTTATTTCTT TCaACGGTAC TCACTATATC 2460

ACAaAAAACC AGCCAGTAAA TTACACTTTC TTTACAAAAC ATTACAATAT CAAGTGTTAT 2 52 0 

TTGtAATGTT GAAATATGGC TGTTTTATAC TGTAATGTGA AATATGTGCC CTTTAGAATC 25 80 

CAATCAACCC TTGAAATAGT CTTTAACACA TAAGATTTTT ACTATATTTA GCTCAACTAT 2640

TACAGCTTTC GTAATATTAC AGATTGTATT TTTGTTACAT AGCTGTAATA TATCTGACAT 2700

TACACATGTA TTGATTGCTA TTATTGTTGT ATATTCAAAG TTTTAAAACA CACATCTTTT 2820

GTGAATTGTC TTATCTTTTA TTAGCGCAAA TAAACTGCAG CTCAATTATA TTGTTCAACT 2 8 80 

TCATTCTCGC AATTCACAAT AACATTAAAT AATTTTTGGT CTCATATTTT CAAAAAACAT 2 94 0 

ACTGTTATTA TCCCATGAAT TTAAAAATAT CATTAGTATA TAAACGAAAC ACTTTACGAT 3 000 

AAATGATATC TGCAAGCCAA GCTGTTACAA ATGGTACAAC AAAGAACGCT ACTACAATTA 3060 

GTAAGACACT CAACCAAGCA GAATCAACCT CCATAAATTT AAATGCATTA ATCGGTCCTA 3120 

CCATTCCTAT AAAACCAAAT CCAGCTGACT CTTTCGTTCC ATGAATACCT ACTAATGCTG 3180 

ATACCAAACC TGATACAATG GCTGTCGTTA ATATTGGTAA CATAAGAATT GGATATTTCA 324 0 

CCATATTAGG TATCATCATT TTAACGCCTC CAAAGAAGAC GGATAACGGC ACCCCTAAAC 3300 

GATTCACTTT ACTTGTACCA ATTATCAATA CTGCTTCAGT CGCGGAGATA CCAATTGACG 3360 

CTGATCCAGC TGCTAAACCT GTAATACCTA TCGCAAAGGC AATGGCCACA GTTGATAGTG 3420 

GCGAAATAAT AATAAGACTA AATACCATTG AAATCAAAAT ACTCATGACA ATCGGTTGTA 34 80 

ATTCTGTAAA ACCATTAACC ATATTACCGA TGGCTGTTGT AATCATTTTC GTATACGGCA 3 54 0 

ATATTAAAAC ACCAATTGCA CCTGAAATAC CGCCAACAAC TGTTGGGAAT ACAATCAATG 3 6 00 

CCATACTACC TACGCGATGT TGAATAAGTA AAATGAATAA CACTGCAATC GCTGCTGTAA 3660 

TCATTGTATT AATTAAATCA CCAATACCCG TAATCATCCA AGCACCATTT TTAAACTGCG 3 720 

CTGCACCGCT TCCTACATAT GCTGCACTTG CCACAACAGC AATTGCTAAT GGCGATAGGT 3780 

CAAATTTCAT GGCAACCAAT GCACCAATCA AAGCAGGTAC TGTAAATTGA ATTGCAACGA 3 840 

CAACGCCTAA TAACGTTTTA AAAATCGGAT GATAATCCAT AAAGTATTTA AAAATTTCTC 3 900 

CAAGTATCGC ATTAGGAACT AAACCCGCAA CAATACCTAT GGCGACACCT GATAAAACTC 3 960 

TAAMTATAAA ATCTTTGGGT GTAATTGTTT TAATTGATGT CATAATATCA TCCTTCCATT 4 020 

TATGTATATA CATCTGTATG CAAATAATAA AGAGCCTTAA GTTATAAGCT GCCACTAGCT 4 080 

TAAATTCTAA GATGTGCATG CCGATGTTGT TATATTTAGG CTAGCAGTAT CATCTATAAC 4140 

TCAAGACTAT GAAAAATAGT ATATCACAAA ATTCTGAATT TTTAGATAAA TAAATTGGCA 4 2 00 

ATTTTTCAAA CATATTGTTA CAATACACTT TTATTTTATC TTCATTTTTA AAATGCATTA 42 60 

ATACAATAGA AGAAAGACAT TCAAATGCTT ACCAAAAAGG TACATTATTT GTTAGGAGCG 4 320 

TATCAGCaCT TACATATCAT CAACACAATT GACAATATAA TAGAAGATAC TGATAATAAG 4 3 80 

TGTTAAAACA ACAGATGTTA GGTAGTGAAC AAATGATGGA AAGTAAATCC ATAGATCC7VA 4 440 

GAATCGTTAG AACCAAACAA TTGCTTGTCG ATGCTTTTCT TAAAATTTCT AGAGAAAAGA 4500

TTTACGCTCA TTTCGCTGAT AAAGAAGACC TCCTAGACTA CACATTATCT GTAACCATTT 4620

TAAAAGACTT GAATGATAAT TTGAGCATTT CTAATGTCAT TAATGAAAAG GTTCTGCGTA 4 680 

ATATTTTCAT TTCAATTGCG AGTTATATCA AAGATGCTGC AAAGTCTTGC GAATTAAATA 4 74 0 

GTGAAGCATT TTGCAACAAA GCACATCAAC GTATTAATAA TGAATTAGAA GATATTTTTG 4 800 

CGATTATGTT AGAAAACAGC TATCCGGAGC ATCAACGAGA TATCATTGTA AATAGTGCGA 4 860 

GTTTTTTAGC AGCTGGTATC TCAGGCTTAG CATTACATTG GTTTAACACG AGTCAAGAGA 4 920 

CAGCCGATGT GTTTATCGAT CGCAACCTTC CATTTTTAAT TCATCATATA GCACATTTTT 4980 

AATAAAACTT GGTATTTAGT CATGCATCTT GAAATCACTA TGTGACTTAG GTTCATACTT 5040

GTACACACAA TAAAATTTAA CGTATTACGA TTGATTAGCC GTGTCTAGGA CATAAATCAA 5100 

CGTCCTATAC TCTACAATGT CATATTAGCA GTCGTTAACT GAATGAAAAT AAGCTTGTCA 5160 

TTAAAACATA TAGATTTTAG TGACAAGCAT TTTTGTTTTr GCGTACTTAA ACAACACTTC 5220

AGGCAATATG TTGTTTAGGC AACAAATGAT ATGTGCGTGT TTATTGGCAA ACGTACGACA 52 8 0 

TAGTAGTATA GTATGTCTAA ACAACATATG TTGCATAGTT GATATGCGTT GTTTAAATAC 534 0 

TAAGATAGGA GGGATTGACG TGAGCGAGAC AGATGAACCT CAGGGGTTTG AACGCACGCA 54 00 

TAATATATTA AATATTAATC AGAGTAGTCT GGGTGTAGTG ACATACATTA CAAATAAATT 54 60 

AAAGTCGACG TTGAAGCAAC ACATAATAAT TGCTCGTGGT AAAAAGCGAA TCGACTATCG 5520 

ACTGTCGTAT AACTTTTACA TACGTATTAT GATAATGTAG AAATCAAGAA AATCGACTGT 5580 

GAATATACCT ATGCTATGCC CATTGCAATT TTAATAAGAC ACACGATGTC ATTCGACAAT 564 0 

GCTCATTTCT TTGCTCAGTT ACGTCATCCT GTCTTATAAA ACAACATTGC AGACATGTAT 5700 

ATCAAACGAC ACTTCAATAA CATCACTTTG CCcATCGTAC TACTAGTAAA ATCGTGTCTC 576 0 

AAATeCCTTA TTTTAATTCC AAAAAtCTGC TGGTCAAAAG ACCGAGAAAC TAAAAACATT 5820

ACTTAATGTG TTGATAAATT ACCATATAAA AATAATCTCA AAATATATCA ACACTTGATT 5880 

CTAAGGAGGA TATGACAATA TGAAAATTTT AGATAGAATT AATGAACTTG CAAATAAAGA 5 94 0 

AAAAGTACAA CCACTTACTG TAGCTGAAAA ACAAGAACAA CATGCATTGC GTCAAGAcTA 6 000 

CTTAAGcATG ATCCGAGGAC AAGTATTAAC AACATTTTCC ACAATAAAAG TGGTTGATCC 6060

AATCGGTcAG GATGTCACAC CAGATAAAGT TTATGATCTT CGCCAACAAT ACGGTTATAT 612 0 

TCaAAATTAA tATTTGCTCA CGAGGTATTG CACTTAAGGT GCCAACTGAC CTCATAAACA 618 0 

AAGCCCATAC TGATTGAAGA CACTAATGTG tCsaCCATGG TGCACATTAC GCTTCATCTC 6240

TGTATGGGCT TTTTATTTAT TCTTTTGAGA ATTTCATTTT AGCAGACCAA AAAATTAAAA 6300

TGAACGACTG TGCCACCCGC TTCTTTCACT TTATTCACCA ACTGGTCAAC TTCTTCATTT 6420

GTGTTCACAC CTAGAGAAAT CATCACTTCA TTTGGTTCAG TATTAAGGCT TTGCTGACTT 64 8 0 

ACATTTTGAA AATGCTTGTn TTCTATTAAA ATTACGGkTG tTTGACCTAT tTGAATGCCG 6540 

ACCATTTTAT CTAACATTTG TGGGTTTCTA TTTATTTTAA ATCCTAACGC TTTATAAAAC 6 6 00 

TGTGCGCTCT TTTCTAAATC TTGCACATGC AAATTAAACC ACATTGATTG AATCATGATT 6 660 

GCACCCCATT CATTACTTAT TATAGTTTTG GACTTTAAGC CAATCACTTA ATGATAATCT 6720 

TGTTGGATTT ATTTCAGCCA TTAATTCAAA GTCTACTTCA TAACCTTTTT CTTCCAACCA 67 80 

TTGCTTTTCT GCAACACCAC TAACAAATTC TCCTTCTATA ACAGTAGATT TACCTGTCAC 6 840 

TTCACTAAAA ATTGTTGCTG CTTCACTTAA TGTAACTTCA TCGGAACCAA TCTCTATTGA 6900 

TTGATGCGTA AAGCTTTGTG GATGTGCAAA AATATACGAT GCAATTTTAG CTATATCAAT 696 0 

AGAAGAAATC ATTGTGAATT TTATATTCGG ATTAATAAAT TCTGGTAATG TAATACGTTC 7020

ATCTTCGACT TTAGCAATGC GTAAAAAATT ATCCATAAAG AATGATGGTT TGATAACTGT 70 80 

TGCATTTATA TTAGATTCCA TTAATCTATT TTCTATTTTT GCTAGTACTT CAAAGTGTGG 714 0 

GCCAGTTCGA TTTCGATTAA CCCCTCCCGC AGTACTATAC ACAATATGTT GAATATTTTC 72 0 0 

TTGCTCAGCT ATTTCAATTA TCTTCATACC TTGTCTTAAT TCTTCGCTAA CATCATCTTT 72 60 

AACGATTGGC TGAATACTGT ATAAGCCATA CTTACCTTTC ATCGCTGATT GCAAACTAAC 73 20 

ATTATCACTC AGATCACCTT CArCGATTGA TAAATGCGGA TGTCCTATGT CTGAAAGTTT 73 80 

ACGATTATnC TTATTTCTAG TTAATGCACT TACATACCAT CCATCCTCTA ACAACTGTTT 744 0 

TACAACTGCA TTACCTTGCT TCCCTGTTGC GCCTATTACn AAAATATCTT TCAT 7494

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11802 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

AATTTATTTC GCCGTCCCAC CCCAACTTGC ATTGTCTGTA GAAATTGGGA ATCCAATTTC 60 

TCTTTGTTGG GGCCCcGCCC CAACTCGCAT TGCCTGTAGA ATTTCTTTTC GAAATTCTCT 120

GTGTTGGGGC CCCTGACTAG AATTGAAAAA AGCTTATTAC AAGCGCATTT TCGTTCAGTC 180

AATTACTGCC AATATAACTT CGTAGATCAT AGAACATTGA TTTATTTCCC AGCCTATTCT 240

AGCAAAGGTA ATAATGATAT TAATAATGTA CAAAAAATAT AAATCAAATC GACATCCTTA 360

TAAAACATCA GAACCACTAA AAACAAAAAA GCACAAAATA AAATTAAATT TAAAATAAAC 4 20 

GACCACTTTT CAAAAAAATC TCtTTTCaTa TTTCCACCCC TAATTTTAAT AAGCATTATT 4 80 

TTATATTCTC TTTTAAGTTT ATTATTCAAA AGGAAAACAG AAATATCTTT CaATATTATT 540 

ATAAACATTT CAACTACTTT TAAAAACCAA CAAAAAAATA CTTATTTTAA GTAGATGAGC 600 

ATAAGTGAAC ATAGTTCTTT AGTTATAATA ATTAATTCAA CCAAAAGTCG ATTTGTTTTT 660 

GCAATTGGTT TTCATTTCCT CTTAAAGATA TTTTCATTAA ATCTGTCAAA TCAATAGACG 720 

CTATATTTTT CAACTTATCT CTATATTTAT TTTTAGTACG TCTTTCTAAA TTTCCCCATT 780

CCTCTTCTTC GTGAGTTAAT AAATGAAGCA TTGCTCGTTC TTGTATATTT TCAATCATTT 84 0 

TTAAATTCGG TTTTAAAATA TGCAAATCAT CAAAACAATC TTTCCAACAA TCAACCATAT 900 

CTCGTTTTAA TTCAATTTCC ACACGCCATA GAAATGTTGA ATCAATTTCA ACATCTGCAT 960

TATCTTTACG TTCTTGTTTT TATTATAAAT CCGAATAAAC CTATCACTAT TACGCACACC 1020 

AAAATATTTT GTTTCTGGTT TTACATTACG TCCATAAAAT ATAGTTTTCT TTACCGACTT 108 0 

ATCTGACAAT GCATAATAGT CATTTAAATC AAATTCAAAA TCAAAAGCCA AATCTAATCT 1140 

CGTAAAACTA ACATCGTCCA AATAACTGAT GATATTTTGT TTTAACCAAA GCACTTCATC 1200 

ATGCGAAAGC TTATTAGGAT TAAATTCAAC GCGCATAtAC GTCTATTCCA AAGAGTTGCT 1260 

TTTATTTTGT CATATTCAAT ATAAACTTTT TCTTTAAGAG CTTTAGCTTT AAAGTTTGTT 1320

TGTAAAATAT CCCAAAGCCG AATTTCAGGA TTAGTACTCA TAAAATGTGA AAGTCTCTCT 1380

GCGTTAGACA TGCTAAGATT CCCAACAATC GTTATAGCGT CAAAAGACAA TTTTGGAATA 1440

GCTAGTGACA TCCTATGTCG ATTTAACCGG CTATTACCGG ATATTAGAGT ATCCAGTTTT 1500 

ACAAATGGAT GAAACGAAAT TCAAAACACT AAAAAATATG TTCCACTAAC AGCAAAAAAA 1560 

TACCATTATG TTCCTACTAA AAAACyAAAA ATACTGGAGA ACAAAT6TCA GGATATAACT 1620 

TAGGATACTA TGTAATAAAA ATTTACAATA AAAAAACAGG AAAACAAATT TCAAGTAAAA 1680 

GmATACCCAT ACAAAGAGGA TAAAATAAAA AACCTCGAAC TGaAATGATG ATCTTTTCAG 1740

CTCGAGGTTT AAATATTGGT GCCTTATTTA TATAGATTCG TTATATTATA TTCTCTATTT 1800 

TCATTAACinT AATCCTTAAA GAGTTTTAAA TTAATACCTG CTAGATGATT CAAAAATGTT 1860 

TCATCAACTT TTAAATAATT CAATAATTTT TGTGGTGTCA GTAAATnTCT ATCAAAATAC 1920

AACTTTAATA AACTATTCAT TTTGACAGGA CGTGACATTT CAATCACGTC GTCTAAAGAT 1980 

AATACTTTCT CXSCTTTAnAC AAAnACAAAA ACTTACCCGA TTAAAATCAA GTAAGTTTTA 2040

TATTTGATAA AAAATCAATA AGTAATTGTG CGCCTTCAAC TTGAATATCT TTTACAACTG 2160

GCGCGTCGAT ATACATATCA TACTGACCAC CGCCTACTGC ACGATAATTA TTTACACAAA 2220

TTGTATATGT CTGCTTTAAA TCAACTGCGT GACCTTGAAT CATCATATTG CTCACACGTT 2280

GTCCCTTTGG TCTTCCAACA TGAATGGTAT AACTTACGCC ACCATATATA TCATAATTAA 2340

AGTGTTGTGG TTTGGGTTCA AGGAAGTCTG CGCTCACACT AACTTCATCA TTTTTCACGT 2400

CAAAATATTC TGCTGATCGT TCAATGGCTT CTTTAAGTTT GGCACCACTT ACAGCTAAAA 2460

CTTTAAATGT ATTTGGAAAT GGGTAATTGT TAATAACATC TCGCATCGTC ACGACTTGCT 2520

TGAAACCACT AGCAGAATCA AACAAAGCTG TACAGGCAAC ATCTGCGTCA CTTTTTTCTA 2580

ATAAAGCGTA ATTCATAAAA TTTGTAAAAG GATGCGGTGC CACACGTGCC TCAAATGCAT 2640

GATTAATCGT CATATCATAT GGCAATGTAG TAATTTCGTA ATCTAACCAG TCCTCTAACT 2700

GCTTTCGTAA ATGTTGGTCA TCTTCATCAA TAGTAAATGT GGAATCATCT ATAACAGGAA 2760

GTAATTCACA TGATTCAACG GATAGATTTT CATATTCATC AGTACTCAAG ACTACTCTGC 2820

CTACAGTTGT ACCTCTCGTA CCAGGTTGAA TCACAGCCGT TTGCTTAAAC CTTTCAGCAA 2880

TTTGTCGATG TTGGTGACCC GTAATAAAGA TATCTATATC TTTAGAAAAC GCTTCTAACA 2940

TGGCATATCC TTCATTTTCA CCCGTTAATA CTTCGGTCGG CGTACCACTT TCTAAATCCT 3000

TTTCAAATCC ACCATGGTAA CAAACCACAA TGATATCTGC ATGTCGCTTC ATTTCAGGTA 3060

AGTATTGTTG AAGTATTTCA AAAGCACTAT GAAACGTArT GnCnTGAATA TGCTCTGGTT 3120

GTTCCCAATG GGGAATAAAT TGTGTCGTTA AACCTATCAC ACCAACAGTT TGATCTCCAA 3180

CCTGAAAATA CTTCACACCG TTATCAGTCA ATGTACTATC ATTTTCATAT ATATTAGCGC 3240

ACAAAACTGG ATAATTGAGT CTGCGTAAAG TGTCTTTTAA GTATGGTAAT CCATAATTAA 3300

ATTCATGATT ACXAAGCGTA CCAAAGTCGA ATGCCATTCG ATTATAAAAA TCAACTAAAG 3360

GCTGGCTACT GCCGCTATGC GCGATTAAGT AATTACAAAA TGGTGACCCT TGCAAAAAAT 3420

CACCATTATC TATTTTAAAA CTTTGGTCAT ACTGCCTTCT GTsTTGTTCT ATAACATGAT 3480

TCGCTAGTAA CAATCCCATA GGTTGATATT GATTTCTACT CGTAAAATCT GTTGGGAAAA 3540

TATAACCATG TACGTCACTC ACGACATAAA ATGCTATGTT TGACATCCTC ACTCACTCCT 3600

TCAATCACAA ACATCTTTCT TATTTCTATT ATATATTTAT TTGAAGTCTG TTGTAATCAA 3660

GGTTTTGTCA CCGAGTTTTA AACGAATCTT TGAACCTTCC ATACTTTCAA GTACTTTAGC 3720

ATTGACCTTA ATTGTGACAT TTCCGTTTTC ATCTGCTTTA ACTGTTGGCA AAGTACTGTA 3780

ACCTGGTGGG TTATAATCGT TATCTTTACT TGAAAATTGT CCGATTTGAC GTCCGCCTTC 3840
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TATTGTCATT 
TTCGTTCTGA 
TTCGCTGCCA 
CGGTTGTGAC 
CAACTTAGTC 
TTTAGACATC 
ATCATATGTT 
CACAAGTGAT 
ATAGGGGCCT 
TTTACTTTTC 
AGTAATTGCT 
GAAACTTTCT 
AAAACGTACG 
AGAACTTTTT 



TCAAATGGCT 
TTTGGTGGTG 
TAACTACCTG 
CATGGCTCTT 
TTTGTATCTA 
CAAGCCGTTA 
TTCTTCTTTT 
GAATCACCGA 
TCTGCTTTAC 
GCTGGCAATT 
AATGTCGATC 
TTTGAAGAAA 
CCAAAGTTTG 
TCTTCTGCAT 



ACCATTCATT 
TGAAATAAAG 
TTCTTTGAAA 
TTCTGTTATG 
ATCCGCATAA 
ATGAGAACGA 
AGC-HWIGAT 
TGACTTATAT 
ATAGGATAAA 
ACTCTTAAAG 
GTACGCTATA 
AGTTATCAAA 
GCAGCAGCAT 
GATTCTAAAT 
AATGCTGGAA 



GCCTCCGATG 
ACAACATTGC 
GGAATGTTAG 
TTCGAATGAC 
TCTGAAAATG 
TCGAATAATA 
TGCGGTAAAG 
ATTTTTTTCA 
AATTCGTTGC 
AAAGTTATTA 
AGTGTAATTT 
TGAAAAATTT 
GTGGAAACTC 
CAGGAGGATA 
CATTAGAAGC 



CATTTACAGA 
TATGATCATC 
CTTTAAATGT 
TTTCAGGCTC 
ATGTTAGGCT 
TATTATTTAA 
CTCCATTATC 
TAAATGCTGC 
CGCCCCCATT 
GTTCTGGTGT 
CGGCATGCAT 
CAATATTGCT 
TAGATAACCA 
TCATACCTTT 
AATCAATACG 
CACCTTGTTt 
CCTCAGGAAT 
CATCAATAGC 
CACCATCACT 
CCGTTCGCTG 
CACTTAATGA 
TTTTGTGAGG 
ATATGAGCAA 
AAATTCATTT 
CATTGCATAC 
TAAGTGTTTA 
AAGTTCTTTA 
CAAACCTAAA 
TAAAGCAAAA 



AACATTTTGC 
TGGTGTGTTT 
TGTTGGATCA 
AGTTGAACGC 
ACTCGCCTTA 
TAGCTTACCG 
TTCTCTTACA 
TTTACCTTTT 
ATAAATACCT 
ATACACAATA 
AGAGACAGAT 
CGTATTTAAA 
ATCTGAACTT 
CGACATATCT 
ATTTAAATTT 
CACATATTTA 
TACAAATATT 
TTTAACGTCA 
AACCCAATCT 
TTGCTTTGTA 
TACCGTTGCA 
CTCCTTTTAA 
TTTAACGAAA 
TTATAAAATA 
ATATTACACG 
TTTGTATTAA 
GATAATCAAA 
GAATTAACCG 
CCATTAGAAA 



GGGATATCAA 
GGCTGAGGAT 
TACCATTTAT 
TCTGGTCGTT 
AGTGATTTCC 
TTGTCTTGTT 
TATTTGGGCG 
CCAACTTTAG 
TGATCTACAG 
CCTTTTGCTT 
TTCACACCTT 
TCACCTAGTG 
TTCACACCTT 
TCATATGCTC 
CGGTCAGCAT 
ACAATTGCTG 
TTGGAACTTT 
TAACCTTGTT 
GCAGCACCAG 
GGTTGCGATT 
ACAATTGCAG 
AATAAATTTG 
AATTTACAAA 
CTTTTTAACA 
ATTAAGAATG 
TGTTAGCAGT 
AGAACGCTAG 
TTCAATTTGT 
AATTACTATC 



ATGTTACTTT 
CTGCGCCTTT 
AACCACTCGG 
CAAAATCAAG 
CATCATTATC 
CTTTAAAACC 
AACTATCTTC 
AAATTGCTAC 
CATGTGACCA 
TCTCTGGATT 
CAGTAATACC 
CATTATATCG 
GCATTGCAGT 
CACGTCGATA 
TGTAATGATC 
CCTGTTCTGA 
TCAAACTTGC 
TTTGTATTGA 
CTGTTTGACC 
CATGCGTTAT 
AGACAGTTAA 
TTCTTGAATT 
ATCTTATCAA 
TTTAAATGTG 
TGAAGGGGAC 
CATTGTTTTT 
TAATGATTCG 
ACCTTCGCAA 
TAAAGAATTA 



3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 

4560 

4620 

4680 

4740 

4800 

4860 

4920 

4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580 

5640

TCTAAAAAAG TTGATGTTGG TTTCTTACCA CCAACGGCAT ACACATTAGC ACATGATCAA 5760

AAAGCAGCTG ATTTATTATT ACAAGCACAA CGTTTCGGTG TAAAAGAAGA TGGTTCAGCA 5820

AGTAAAGAAC TTGTAGATAG TTATAAATCA GAAATTCTTG TTAAAAAAGA CTCAAAAATT 5880

AAAAGCTTGA AAGATTTAAA AGGTAAGAAA ATTGCCTTAC AAGATGTAAC ATCAACTGCT 5940

GGATATACAT TCCCACTTGC GATGTTAAAA AACGAAGCAG GTATTAATGC AACTAAAGAT 6000 

ATGAAAATTG TGAATGTTAA AGGTCATGAC CAAGCAGTTA TCTCATTATT AAATGGAGAt 6060 

GTAGATGCTG CGGCTGTATT TAACGATGCA CGTAATACTG TGAAAAAAGA CCAACCAAAT 6120 

GTATTTAAAG ACACACGAAT TTTAAAATTA ACACAAGCTA TTCCGAATGA CACAATTTCT 6180 

GTAAGACCAG ATATGGATAA AGATTTTCAA GAAAAATTGA AAAAAGCTTT TATAGACATT 6240

GCTAAATCAA AAGAAGGTCA CAAAATTATT AGCGAAGTTT ATTCACATGA AGGATACACA 6300

GAAACGAAAG ATTCAAATTT CGACATTGTA AGAGAGTACG AAAAATTAGT TAAAGATATG 6360 

AAATAATCAT TATTTAACAA ATGAATCATT AGCGAATTTG GTATTAAAAG CTTTCGTTCA 6420 

ATAGATATAT TCTAGATTAA TATTGAAAAG CTAGGCGCTA AACTGAAACA GATATAGAAA 6480

GGTGTCGCTG TACATTTGAA ACCATTTGTA CACAGAAACC CAATGTCTAT GATATTTCAG 6540

TTTACCTTGG CTTTTCTTTA TTAAAGAAAG GTGTCAAACA TGAGTCAAAT CGAATTTAAA 6600 

AACGTCAGTA AAGTCTATCC TAACGGTCAT GTAGGCTTGA AAAATATTAA CTTAAATATT 6660 

GAAAAAGGTG AATTTGCAGT TATTGTCGGA CTATCTGGTG CTGGGAAATC CACGTTATTA 6720 

AGATCTGTAA ATCGTTTGCA TGATATCACG TCAGGTGAAA TTTTCATCCA AGGTAAATCA 6780 

ATCACTAAAG CCCATGGTAA AGCATTATTA GAAATGCGCC GAAATATAGG TATGATTTTC 6840

CAACATTTTA ATTTAGTTAA ACGGTCAAGT GTATTACGAA ATGTACTAAG TGGACGTGTA 6900 

GGTIATCACC CTACTTGGAA AATGGTATTA GGTTTATTCC CAAAAGAAGA CAAAATTAAG 6960

GCAATGGATG CACTAGAACG CGTCAATATC TTAGATAAAT ATAATCAACG CTCTGATGAA 7020

TTATCAGGTG GCCAACTIACA ACGTATATCT ATTGCACGTG CGCTATGCCA AGAATCTGAA 7080 

ATTATTCTTG CAGATGAACC AGTTGCTTCA TTAGACCCAT TAACTACGAA ACAGGTTATG 7140

GATGATTTAA GAAAAATCAA CCAAGAATTA GGCATCACAA TTTTAATTAA TTTACATTTT 7200

GTTGACTTGG CAAAAGAATA TGGCACACGC ATCATTGGTT TACGTGATGG TGAAGTTGTC 7260 

TATGATGGTC CTGCATCTGA AGCAACAGAT GACGTATTTA GTGAAATATA TGGACGTACA 7320

ATTAAAGAAG ATGAAAAGCT AGGAGTGAAC TAACATGCCT TTAGAAATAC CTACAAAGTA 7380 

TGACTCCCTT TTAAAGAAAA AGGTTTCTTT AAAAACGAGT TTTACCTTCA TGTTAATCAT 7440

AATACCTCAA ATAGGTGATC TATTCAAACA AATGATTCCA CCTGATTTCG AGTATTTACA 7560

ACAAATTACA ACGCCAATGT TAGATACCAT TCGAATGGcT ATCGTAAGTA CAGTATTAGG 7620

TAGCATCGTT TCAATACCAA TTGCGTTATT ATGTGCTAGC AATATCGTTC ATCAAAAGTG 7680

GATTTCAATA CCCTCGCGCT TTATTTTAAA TATAGTTCGT ACTATTCCAG ATTTGTTATT 7740

AGCAGCAATC TTTGTGGCTG TATTTGGAAT CGGTCAAATT CCAGGGATAT TAGCACTGTT 7800 

TATTTTAACT ATCTGTATTA TTGGAAAATT ATTATATGAA TCATTGGAAA CGATAGATCC 7860 

AGGTCCAATG GAAGCAATGA CGGCTGTTGG CGCTAATAAA ATAAAATGGA TTGTTTTCGG 7920 

TGTTGTACCA CAAGCCATAT CGTCATTTAT GTCATACGTA TTATATGCAT TTGAAGTAAA 7980 

TATACGTGCT TCAGCTGTGC TTGGATTAGT CGGCGCTGGC GGTATTGGAT TGTTTTATGA 8040

TCAAACACTT GGTTTATTTC AATATCCAAA AACAGCAACG ATTATTTTAT TTACTTTAGT 8100 

TATCGTCGTC GTCATTGATT ACATCAGTAC GAAAGTGAGG GCACATCTCG CATGACACAG 8160 

GAAATAGCAA AATATAATGT TCACACAAAA GCACACAAAC GAAAATTGAT TAAAAGATGG 8220 

CTTATTGCAA TTGTCGTCTT AGCTATTATC ATCTGGGCAT TTGCAGGTGT ACCAAGTTTA 8280 

GAACTTAAAA GTAAATCATT AGAAATCTTA AAATCCATAT TCAGCGGATT ATTCCATCCT 8340 

GATATCAGCT ATATCTATAT ACCAGATGGC GAAGACTTAT TACGTGGTTT ACTTGAAACC 8400

TTTGCGATAG CCGTTGTAGG TACTTTCATC GCCGCAATTA TCTGTATTCC ATTAGCATTT 8460 

CTAGGTGCAA ATAATATGGT AAAGCTACGC CCAGTTTCAG GTGTTAGCAA ATTTATTTTA 8520 

AGTGTTATAC GTGTCTTCCC AGAAATTGTA ATGGCACTTA TATTTATCAA AGCTGTTGGC 8580 

CCAGGTTCAT TTTCAGGTGT ATTAGCTTTA GGTATCCATT CCGTAGtATG CTTGGGAAAC 8640

TTTTAGCTGA AGATATTGAA GGTCTAGATT TCAGTGCTGT AGAATCATTA AAGGCCAGTG 8700 

GTGCEAATAA GATTAAAACA CTCGTATTTG CAGTCATACC ACAAATTATG CCTGCCTTTC 8760 

TATCACTCAT ACTTTATCGC TTTGAACTAA ACTTACGTTC AGCTTCTATA CTGGGGCTAA 8820

TTGGGGCTGG TGGTATCGGG ACACCACTCA TATTTGCCAT TCAAACACGT TCTTGGGACC 8880 

GTGTAGGTAT TATATTAATC GGTTTAGTAC TAATGGTCGC AATTGTCGAT TTAATTTCCG 8940

GTTCAATCCG AAAACGTATT GTTTAACATT AAATCAGGAT ACTCCTAAAT AAGAAGTCCT 9000 

ACCGTCTTAC GTTTCTCTAT TATAATAAAA ACAGCAGTGA AGAAAACTAT TGTTATAGTT 9060 

AACTTCACTG CTGTTTTTAT AATATCTAAA TTTATTCTAT TTCAATTCCT TTAAATAACT 9120 

TTTACCGAAC TCTGGTAATG TTACGTTGAA ATTATCTGCT ATAGTTGCAC CGATAGAACT 9180 

GAATGTAGTA TCACTTTCTA GTGCATGACC ACCTTTAAAT TTCGGACTGT ACATAATTAC 9240

TGTAATAATT ACTABATCGT CTTCTTTTAA GTTGCTAAAC AGTTCTGGCA AGCGATCATC 9360

GAAATCTTTA ATTGCTTGTG CATAACCTGG TTTATCACGA CGATGACCGT ATAATGCATC 9420

AAAGTCTACT AAGTTTAAGA AGCTAATACC TGTGaAATCT TTCTTAACAA TTTTCATCAA 9480

TTGATCCATA CCGTCCATGT TACTCTTCGT ACGAACCGCT TCTGTTACAC CTTCACCATC 9540

ATAAATGTCA TTAATTTTAC CGATGGCAAT AACATCATAA CCACCGTCTT TCAAATGATC 9600

TAAGACAGTT TTACCAATIAG GTTTTAACGC ATAGTCATGT CGATTAGATG TACGTGTAAA 9660

GTTTCCTGGT TCACCAACAT ATGGACGTGC GATAATACGA CCAATTAAAT ATTTAGGGTC 9720

TTTTGTCAAC TCACGAACCT TTTCACAAAT ATCATATAAC TCTTCTAATG GGATAATGTC 9780

TTCATGTGCA GCAATTTGCA ATACTGGGTC TGCACTTGTA TAAACAATTA AGTCACCAGT 9840

TTTCATTTGG TGCTCGCCCC ACTCATCGAT AATTTGCGTA CCCGATGCCG GTTTGTTAGC 9900

AACAACTTTA CGACCTGTCA TTTCTTCAAT TTGTTGAATT AACTCTTCAG GGAATCCATT 9960

AGGGTATACT TTAAAAGGTT GCATAATATT TAATCCCATA ATTTCCCAGT GACCAGTCAT 10020 

TGTATCTTTA CCAACTGAAG CTTCACTCAA TTTAGTATAG TATGCTTCTG GTTGTTCAAC 10080

TGCATTTACT ACTGGTAATT TATCGATGTT CCCTAGACCT AACTTTTCAA GGTTTGGTAA 10140

AGTTTGATCG AAACCTTCTA AGGTATGTCT TAAAGTATGT GAACCTTCAT CTTTAAAATC 10200 

AGCTGCGTCT GGCGCTTCAC CAATACCTAC TGAATCCATT ACGATTAAAT GTACACGATT 10260 

AAATGGTCTT GTCATAGCTA TCACTCCCAA AATTTATATA TATTAGTAAT CTGAATCTGC 10320 

TTCTAAACCT TGCATAATTT GAACACCTGC GCTCGCACCA ATACGTGTCG CACCTGCTTC 10380

AACCATTTTA TTGAAATCTT CTAAATTACG TACGCCACCT GATGCTTTTA CTTCTACATC 10440

AGCACCTACT GTATCTTTCA TTAATTTAAC GTCTTCTGCA GTCGCACCGC CACCTGCAAA 10500 

ACCTGTTGAA GTTTTAACGA AGTCCGCACC AGCCGCTTTT GTTAATTCAC TCGCTTTTAC 10560

AATTTCGTCA TGGTCCAACA ATACCGTCTC AATAATCACT TTTACTGTGT GACCTTTCGC 10620 

AGCTTTAACC ACTGCTTCAA TGTCTTGTTG TACATCATCA AAAC6TCCAT CTTTTAATGC 10680

GCCGATGTTG ATGACCATGT CAATTTCATC TGCACCATTT TGAATTGCAT CTTCTGTTTC 10740 

AAATGCTTTC GTTGCAGTTG TCGACGCACC TAATGGGAAT CCTATTACCG TACAAACGAG 10800 

CACCTCTGAA TCAGCTAGTC GCTCTGCTGC ATATTTAACA TGTGTTGGAT TCACACATAC 10860 

AGATTTAAAA TTGTATGctT TCGCTTCATC GATGATTTGA TCGATTTGCG TACGTGTTGA 10920 

CTCAGGCTTC AATAAAGTGT GATCTATATA TTTCTCAAAT TTCATACTTA CTACTCCTCG 10980 

TGTTATATAA TCTCTTTATT TAATTTTACT ATAAATACGA ATATATCTCG CGAATTTATA 11040

ATACTCATTA AACCTAAAAT AATTAAAATA ATACCGAAAT GTGAACTTAA TGCATCATTG 11160

CCTGGGAAAT TTAATGCTTT AAAATCGATT AGAGCCGCAG CAATCGCAAT ACCTACAGAT 11220 

ACCGCCACAT TAATAATTAA ATTATAAAAA CCAATAGCCA CACCTGTCAT ATTAAGATCT 11280 

ATTGTTTTAA TGGCTTCGTT AAGTAAAGGT GCATACATTA AAGCAAAGCT ACCTGCAAAG 11340

AATATCATAG AAATGACXJAA GATTGAAATG TGATTACCTA CTGCAAATGC AGGTAAAATC 11400

AAGCTCAGTG CTATTAAAAT AATTGCTGTG ATAATCGCTT GTTTTGAATT CAGATATTCG 11460

CCGATTTTAC CACTTAGTGC ACCAACAATG ACTGCTACTA TATAACCCGG TACTAATAAC 11520 

AGTGATGTTG TGTCTAGTTG CAGATGATAA ATTTGCTCCA TTATGAATGG GAACGTAAAA 11580

ATATAACCCA ATTGGATAGC ATACATTACA AATACTATAA ATAAAAATGA AGCATAACGT 11640 

TTATTTTGGA AAAATGATTT ATTTACTAAT GGACGTTGCG CATTTTTAAT ATATAGCGCA 11700

AAAACGATAA TCGCAATTAA GGCACCAATC ATATATAACC AATTAAAGTT CGTAATAAAC 11760

AGCATGACTG TTGTAGCAGG GGATCCTCTA GAGTCGAnCC TG 11802 
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1196 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 



CTAAAGAAGA TGCGAAACAA GATGTTGATA AACAAGTTCA AGCTTTAATT GACGAAATCG 60

ATCAAAATCC AAATCTAACA GATAAGGAAA AACAAGCACT TAAAGATCGT ATTAATCAAA 120

TACFTCAACA AGGTCATAAC GACATTAACA ATGCGATGAC AAAAGAAGCA ATTGAACAAG 180

CAAAAGAACG TTTAGCGCAA gCATTGCAAG ACATCAAAGA TTTAGTGAAA GCTAAAGAAG 240

ATGCGAAAAA TGATATTGAT AAACGTGTAC AAGCTTTAAT TGACGAAATC GATCAAAATC 300

CAAATCTAAC AGATAAGGAA AAACAAGCAC TTAAAQATCG AATTAATCAA ATACTTCAAC 360

AAGGTCATAA CGACATTAAC AATGCGCTGA CTAAAGAAGA AATTGAGCAG GCAAAAGCAC 420

AACTTGCACA AGCATTGCAA GACATCAAAG ATTTAGTGAA AGCTAAAGAA GATGCGAAAA 480

ATGCAATAAA AGCCTTAGCT AATGCGAAgc GTGATCAAAT CAATTCAAAT CCAGATTTAA 540

CACCTGAGCA AAAAGCAAAA GCGCTCAAAG AAATTGACGA AGCTGAAAAA CGAGCACTAC 600

AAAACGTTGA GAATGCTCAA ACTATAGATC AATTAAATCG AGGATTAAAC TTAGGTTTAG 660

TTGAAGCAAC ACCTGAGCAA ATCCTAGTTA ATGGTGAACT CATTGTACAT CGTGATGACA 780

TCATTACAGA ACAAGATATT CTTGCACACA TAAACTTAAT TGATCAGCTT TCAGCAGAAG 840

TCATCGATAC ACCATCAACT GCAACGATTT CTGATAGCTT AACAGCAAAA GTTGAAGTTA 900 

CATTGCTTGA TGGATCAAAA GTGATTGTTA ATGTTCCTGT AAAAGTTGTA GAAAAAGAAT 960 

TGTCAGTAGT CAAACAACAG GCAATTGAaT CAATCGAAAA TGCGGCACAA CAAAAGATTA 1020 

ATGAAATCAA TAATAGTGTG ACATTAACAC TGGAACAAAA AGAAGCTGCA ATTGCGnAAG 1080 

TTAATAAGCT TAAACAACAA GCAATTGGAT CATGTTnAAC AATGGCACCT GGATGTTCCA 1140

TTCAGTTGAA GGAAATTTCA ACAACAAGGA ACAAGCGCCn GATTGGAACA ATTTGA 1196 
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1519 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

CAATCGTTTC AACGCTATTA TCTTTAGACA ACAATTGTAA GCGTGTATGT GCAGTTTCTA 60 

AACAGTCTAT AATTCGAGTT CTTAATTCAG CTGGATCATC TTTAAAAATA AAATCCATCG 120 

CTGCAACTTT GTAGACAAAT GTTAAATAGG TAAGTTCACT GTGACTCGTA ACGAAAATAA 180 

TGTTACCAAC TGGGTCATGC TTACGAATTT CACTGCCTAA TTTGATACCA TTAATATCAG 240 

TTGAAAGTTG AATATCTAAA AAGTAACAGC CTATGTCATT CATATTTTTA GCTTGCTCAA 300 

GCACCTCATA AGGATTATCA GTTGCGAGGG CAATTTCCAT AGGCTTTTCT TCTATCATTA 360 

TATAATTTTT AATAATGGTA ACCATGTTTT CTCTTTGTTT TGGATCGTCT TCGCAAATGA 420 

AAATTTTCAT ACATTCACAT CCTTATGGCT AGTTGTTAAT AATTTCAACT TTTTGAATAA 480 

AGAAACCATT TTCGATAATT GTATCTAATA AGACATTGTC TGCATTATCA GCAATTTCTT 540

TTAAAGTTGA TAGACCTAAA CCACGACCTT CACCTTTAGT AGAAAAACTT TCTTGGAACA 600 

ATTCATGAAT GCGTGGTATA TCATCAGCGC ATTTATTCAT AACAATAAAC GTTACTGAAT 660

TTTCACTTTC AATAAATGCA ACGCGAATGA TAGGGTCATC AATTTCAGTT GATGCCTCAA 720 

TTGCATTATC AAGAATAATA CCAATACTGC GACTTAAATC GATCATATTC AAGTTAATGC 780

TACTTACTTC ATCGGGTATT TCGATACTAA TCGGAATATT CATTTCTTGT GCACGTAAAA 840

TTTTCGCAGT AATTAAGCCT TTAATTTCAC GTACTTTAAG ATTCTCGATA CCATTTAATT 900

GTAGGCCAGG CATGTCATCT TCTCGAATGT ATTCTGAAAG TGTCGTTAAG ATATTGACAT 1020

AATCATGACG GAACTTGCGC ATTTCGTTGT TGATAGCTTC AATCTTCAAT GTATATTCAT 1080 

AATAGGTTTC AATTTCTTCT TGATTACGTT TATATTTCAT CTCTTTAAGG AGAAATTGAG 1140

AAATAACAAA TGTTAATATA CTTAAAAATA TAGTGATACC AATAAAAATA AAAGAATACT 1200

GCCTTATTAC TTTAGCTTCA TCCGAGTTTA TTTGTGAATA AAAGAAAAAT AATGAAAAAG 1260 

TAAGCAGTAA GATAGTCGAA ATAACTATTA AAAATCCTTT GTTTAGTATT AGATATGGTG 1320 

TGCTAATTTT TTTGAGAACT CTATTTATTA TATATGAGAA TAGTATACTA ATAGTCACAT 1380

AAACTACAAA AAAGCTAGGG AATATTACAA ATATACTATC AGAAATTTTG GTGGATATAT 1440

GCATATATAA CTATATACCT GTAGTTAGCA CnGTnATAGG AATAATCnGG CGAGGTCCAT 1500

AATCCACCAA AATAGAATA 1519
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5445 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

GTAGGAATCT CTTTGTCTTT TTGGGAGGAC ATTTAATATG AATGTATATT TAGCAGAATT 60 

CCTAGGAACT GCAATCTTAA TCCTTTTTGG TGGTGGCGTT TGTGCCAATG TCAATTTAAA 120 

GAGAAGTGCT GCGAATGGTG CTGATTGGAT TGTCATCACA GCTGGATGGG GATTAGCGGT 180 

TACAATGGGT GTGTTTGCTG TCGGTCAATT CTCAGGTGCA CATTTAAACC CAGCGGTGTC 240 

TTTAGCTCTT GCATTAGACG GAAGTTTTGA TTGGTCATTA GTTCCTGGTT ATATTGTTGC 300 

TCAAATGTTA GGTGCAATTG TCGGAGCAAC AATT6TATGG TTAATGTACT TGCCACATTG 360 

GAAAGCGACA GAAGAAGCTG GCGCGAAATT AGGTGTTTTC TCTACAGCAC CGGCTATTAA 420

GAATTACTTT GCCAACTTTT TAAGTGAGAT TATCGGAACA ATGGCATTAA CTTTAGGTAT 480

TTTATTTATC GGTGTAAACA AAATTGCCGA TGGTTTAAAT CCTTTAATTG TCGGAGCATT 540

AATTGTTGCA ATCGGATTAA GTTTAGGCGG TGCTACTGGT TATGCAATCA ACCCAGCACG 600

TGATTTAGGT CCGAGAATTG CACATGCGAT TTTACCAATA GCTGGTAAAG GTGGTTCAAA 660

TTGGTCATAT GCAATCGTTC CTATCTTAGG ACCAATTGCC GGTGGTTTAT TAGGTGCAGT 720 

GGTATACGCT GTATTTTATA AACATACATT TAATATTGGT TGTGCAATTG CrATTGTTGT 780

CGAATCAATT TACTAAAATA AAAAGAAACG TAAATAGCAT AATTTAACAT GTTTGATTCA 900

TGGATTATGC TATTTTTTCG CCAAAATTTA ACAGATTTTG TACAATGGGT TAGCGATTAT 960

TTTTTAATAA AGGAGATACT ACTAATGGAA AAATATATTT TATCTATAGA CCAAGGAACA 1020

ACAAGCTCAA GAGCGATTTT ATTCAATCAA AAAGGGGAAA TTGCAGGGGT AGCACAACGT 1080

GAGTTTAAGC AATATTTTCC ACAATCAGGT TGGGTTGAAC ATGATGCAAA TGAAATTTGG 1140 

ACATCTGTGT TAGCTGTAAT GACGGAAGTA ATTAATGAAA ATGATGTTAG AGCTGATCAA 1200 

ATTGCAGGTA TCGGTATTAC AAACCAACGT GAAACAACGG TTGTTTGGGA CAAaCATACT 1260 

GGCCGCCCAA TTTATCACGC AATTGTTTGG CAATCACX3TC AAACACAATC AATTTGTTCA 1320 

GAATTAAAAC AACAAGGATA TGAACAAACA TTTAGAGATA AGACAGGATT ACTTTTAGAT 1380 

CCGTATTTTG CAGGTACAAA AGTTAAATGG ATTCTAGACA ATGTTGAAGG TGCACGAGAA 1440 

AAAGCAGAAA ATGGCGATCT ATTATTTGGA ACGATTGATA CTTGGTTAGT ATGGAAATTA 1500

TCaGGaAAAg CtGCGCATAT TACTGATTAT TCaAATGCGA GTCGTACATT AATGTTTAAT 1560 

ATCCATGATT TAGAATGGGA CGATGAGTTA TTAGAACTAt TACAGTACCT AAAAATATGT 1620

TGCCAGAAGT TAAAGCTTCG AGTGAAGTAT ATGGTAAGAC AATTGATTAC CACTTCTATG 1680

GTCAAGAAGT ACCAATCGCT GGAGTAGCTG GTGATCAACA AGCAGCATTA TTTGGACAAG 1740

CTTGCTTCGA ACGTGGTGAC GTGAAAAACA CATATGGAAC TGGTGGCTTC ATGTTAATGA 1800 

ATACAGGTGA CAAAGCGGTT AAATCTGAAA GTGGTTTATT AACAACAATT GCTTATGGTA 1860 

TTGATGGAAA AGTAAATTAT GCGCTTGAAG GTTCCATCTT TGTTTCGGGT TCAGCAATCC 1920 

AATGGTTACG TGATGGATTA AGAATGATTA ATTCAGCACC ACAATCAGAA AGTTATGCGA 1980 

CACGAGTTGA CTCTACTGAG GGTGTTTATG TTGTTCCAGC TTTTGTAGGT TTAGGAACAC 2040

CATMTGGGA TTCTGAAGCA CGTGGTGCGA TTTTCGGTTT AACACGTGGA ACTGAAAAAG 2100 

AGCACTTTAT CCGTGCAACT TTAGAATCAC TATGTTACCA AACTCGTGAC GTTATGGAAG 2160 

CTiATGTCAAA AGACTCTGGT ATTGATGTCC AAAGTTTACG TGTCGATGGT GGTGCAGTTA 2220 

AAAATAACTT TATTATGCAG TTCCAAGCAG ACATTGTTAA TACTTCTGTT GAAAGACCTG 2280

AAATTCAAGA AACTACAGCT TTAGGTGCTG CATTTTTGGC AGGTTTAGCA GTTGGATTCT 2340

GGGAGAGTAA AGATGATATC GCTAAAAACT GGAAATTAGA AGAAAAATTC GATCCGAAAA 2400 

TGGATGAAGG CGAAAGAGAA AAATTATATA QAGGTTGGAA AAAAGCTGTT GAAGCAACAC 2460 

AAGTTTTTAA AACAGAATAA ACTTGTAGAT TAGACTTTTG TATAAACATT GTGATACAAT 2520

CAATTTAAGT TAATATTTGA ATCGAGAAGC GAGAGATTTG TTCGAACATG TACAATTGAA 2580

GCATTGTCTA CTTTTAAGAG AGAACATATT AAAAAGAATT TAAGAAATGA TGAATATGAT 2700

TTAGTAATTA TTGGTGGCGG TATTACAGGT GCAGGTATTG CACTAGACGC GAGTGAAAGA 2760

GGAATGAAAG TTGCATTAGT TGAAATGCAA GACTTTGCAC AAGGAACAAG CTCAAGATCT 2820

ACAAAATTAG TCCATGGTGG TTTGCGTTAC TTAAAACAAT TCCAAATTGG AGTAGTTGCC 2880

GAAACTGGTA AAGAACGTGC GATTGTTTAT GAAAATGGGC CTCATGTTAC GACTCCAGAG 2940

TGGATGCTTT TACCAATGCA TAAAGGTGGA ACATTTGGTA AATTCTCAAC ATCAATTGGT 3000 

TTAGGAATGT ATGATCGTTT AGCAGGTGTT AAGAAGTCTG AACGTAAAAA AATGTTATCT 3060 

AAAAAAGAAA CTTTAGCTAA AGAACCATTA GTTAAAAAAG AAGGTCTAAA AGGCGGCGGT 3120 

TACTATGTTG AATATCGTAC TGACGATGCG CGTTTAACTA TTGAAGTTAT GAAGCGTGCT 3180 

GCTGAAAAAG GCGCAGAAAT TATCAACTAT ACTAAATCTG AACACTTCAC TTATGATAAA 3240 

AATCAACAAG TAAATGGTGT TAAAGTTATA GATAAATTAA CTAATGAAAA TTATACAATT 3300

AAGGCTAAAA AAGTGGTTAA TGCAGCAGGT CCATGGGTTG ATGATGTTAG AAGTGGTGAT 3360

TATGCACGCA ATAATAAAAA ATTACGTTTA ACTAAAGGTG TACATGTTGT TATTGATCAA 3420 

TCAAAATTCC CATTAGGTCA AGCAGTATAC TTTGATACTG AAAAAGATGG AAGAATGATT 3480

TTTGCAATTC CACGTGAAGG AAAAGCGTAT GTAGGTACTA CAGATACATT CTATGACAAT 3540

ATCAAATCTT CACCATTAAC TACACAAGAA GACAGAGACT ATTTAATCGA TGCGATTAAT 3600

TACATGTTCC CTAGTGTTAA TGTTACAGAT GAA6ATATTG AATCAACATG GGCAGGAATT 3660

AGACCATTAA TTTACGAAGA AGGCAAAGAC CCTTCTGAAA TCTCTCGTAA GGATGAAATT 3720 

TGGGAAGGTA AATCAGGTTT ATTAACTATT GCAGGTGGTA AATTAACAGG CTATCGTCAC 3780

ATGGCTCAAG ACATTGTTGA TTTAGTATCT AAACGCTTGA AAAAAGACTA CGGTTT7VACA 3840

TTTAJSTCCAT GTAATACAAA AGGTCTGGCA ATTTCAGGTG GCGATGTAGG TGGTAGCAAG 3900 

AACTTTGATG CGTTTGTAGA GCAAAAAGTA GATGTAGCTA AAGGATTCGG CATTGATGAA 3960

GATGTTGCAA GACGTTTAGC ATCTAAATAT GGTTCAAATG TTGATGAATT GTTCAACATT 4020

GCGCAAACAT CTCAATACCA TGATAGCAAG TTACCATTAG AAATTTATGT AGAACTTGTT 4080 

TATAGTATTC AACAAGAAAT GGTATACAAA CCTAACGATT TCTTAGTTCG TCGTTCTGGT 4140

AAAATGTATT TCAATATTAA AGATGTATTA GATTATAAAG ATGCTGTCAT CGATATTATG 4200

GCAGATATGC TTGATTACTC TCCAGCTCAA ATTGAAGCAT ATACTGAAGA AGTTGAGCAA 4260 

GCAATTAAAG AAGCGCAACA TGGaAATAAT CAACCAGCAG TTAAAGAATA AtTAATTTGT 4320

ACAATCATAA ACTGGTGTCC TGTTTTAAGG GCATCAGTTT TTTTATACGA GATACATTAG 4380

GTTATTAAAG GTGTGAGATG ATGACTGAAA AACAATTTAA ATTAACTGTA CAAGATAATA 4500

CGAATATTGA AGTTAAAGTG AATTTTACAG ATGTAGATTC AAAAGGAATT ATTCATATAT 4560

TTCATGGTAT GGCTGAACAT ATGGAACGTT ACGATAAATT AGCACATGCA CTTTCAAAGC 4620

ATGGCTTCGA TGTGATACGT CATAATCATC GAGGACATGG TATTAATATT GATGAATCAA 4680

CAAGAGGGCA TTACGATGAT ATGAAACGAG TTATCGGTGA TGCCTTTGAA GTAGCGCAAA 4740

CAGTGAGAGG CAATGTTGAT AAACCATACA TTATAATCGG ACATTCAATG GGATCCGTTA 4800

TAGCTAGATT GTTTGTAGAA ACATATCCGC AATATGTTGA TGGTCTAATT TTAAGTGGTA 4860

CTGGTATGTA TTCATTATGG AAAGGTTTAC CAACCGTTAA AGTGTTACAA CTGATTACAA 4920 

AAATTTATGG TGCTGAGAAA CGAGTTGAAT GGGTTAACCA GTTAGTATCA AATAGTTTTA 4980

ATAAAAnnAT ACGTCCATTA CGTACACAAA GTGATTGGAT TTCTAGTAAT CCAATTGAAG 5040

TAGATAaCTT TATTAAAGAT CCATATAGTG GaTTTAATGT GTCAAATCAA TTATTATATC 5100

AAACAGCCTA TTATATGCTA CATACATCAC AATTAAAAAA TATGAAAATG TTAAaTCATG 5160 

CCATGCCTAT ATTATTAGTT TCAGGATATG ACGATCCTTT AGGTGATTAT GGTAAAGGGA 5220

TTTTAAAATT GGCGAATATA TATAGAAACG CTGGCATnAA AAATGTTAAA GTGAATCTTT 5280

ATCATCATAA ACGTCATGAA GTGTTATTTG AAAAnGATCA TGACnAAATT TGGGAAGACT 5340

TGTTTAAATG GTTGAATCAA TTTTATAAAA AATAAAGAAA GTGGAATTAA ATATGAATAA 5400

AAATAAGCCT TTTATTGTAG TAATTGTGGG GCCAACTGCT TGCAG 5445

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2569 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

TGGCTTGAAC TACGCCAATA AGTCCCCCTA GTACAAGAAT GAATACCATG ATATCGACCG 60

CTTCTATCGT ACCTTCAACC ATGCTACTTG TTATTTGTTC TGGTCCAGCT GGATGTTGCT 120

TTAATCTTTC ATAAGTATTC GGAATTGATA CCGGCTTATT AATTGCACCT GATTTAAATT 180 

GTTCAATCTT AATTTTAACC CCCATTTTGT CTAGTTCCTG TTGCGTACCC GGAACCTTTT 240 

TCACTTGGTT ATGAGGGTTA ACTATCTTTA GTTCTTGGGA TGAAGGTTCG TAAGAAAGTT 300

TAGAATATGC ACCAGCAGGA ATAACCCATG TTGCTATAAC TGCAACAACC GTTAAAATGA 360 

TAATTGTATT TTCCACGGTT TCATCTCCTT CGACATTTAA CCTAGCATTT CTACCTTAAA 480

GATTTTATAA ATATAAATTA AGAAAGTGCA CCCCGCATCA AAATAGAGGC ATTATTTTCA 540

GGGGGTGCAC ATAAATAATA AAAATCATGC ATTTGACATA TAGTAATTGA AAAGCGTTTC 600

AATTCAATTA CTTTTTAATC ACAGTACCTA CTTTACCCTC TAAGGCAGCA TCTAATTCAT 660

TTAATGATGT TATAAGCACA CTTCCTTTTG GATTGTTTTC AATAAATGAT ATGGCTGCTT 720 

CAATTTTTGG TAACATACTT CCTTTTGCAA ATTGATTTTC GTCTATATAT CGTTTTAATT 780 

CATCAACATT TGTTGTTTTC AAAGGCTGTT GGTTTTCAGT GTTAAAATTA ATATATACAT 840

AATCAATTGC TGTTAAAATA ATCAATTGAT CGCATTGAAT ATTAGCACCC AACAACGCAC 900

TTGTTTTATC TTTGTCTATA ACTGCATCAA TACCTTTAAA ACCATCATGT TGCTCTCTAA 960 

TTACTGGTAT ACCTCCACCA CCAGCAGCAA TAACGAGTGT ATCATTTTTA ATAAGTGTTT 1020 

TAATACTCTC TAATTCAATA ATAGAGATGG GTTGTGGTGA AGGAACAACG CGTCTATATC 1080

CTCTTCCAGC ATCTTCAACA AATATAAATC CTTTTTCTTT TTGAATTTGT TCAGCTTCTT 1140

CTTTGTTGTA AAATAACCCA ATTGGTTTTG AAGGATTGTT AAATGCCGGA TCATTTTCAT 1200 

CAACTTCAAC TTGTGTCACT AGTGTTACCA CTTGTTTATC CATTCCAATA GAATGCAATT 1260 

CATTTTGTAA GCTTTCTTGT AATTGATAGC CGATGTAAGC TTGACTCATT GCGCCACATT 1320 

CAGCAAATGG AAATGCCGGA CCTTGGTTAT GTTCTGCAGC ATAGTTAAGT CCCAAATTAA 1380 

TGCTTCCAAC CTGTGGTCCA TTACCATGAC TAATAACAAT CTCATGTCCT TTTGTnATTA 1440

AyCCTACTAA TGATTtCGCA GTATTTTTAA CAAGCTCGAG TtGgTyCTTG aGGTGATTTn 1500 

CCTAAAGCAT TACCACCTAA TGCTACTACT ATTTTCGCCA TCATATTCAC TTCCTTATAT 1560 

CATTTAAAAT TCACCCAATG TAGCAACCAT GaCTGCTTTG ATTGTATGCA TTCTGTTCTC 1620 

AGCTTCTTGG AATACAACTG AAGCTTTACT TTCGAATACT TCATCTGTAA CTTCCATTTC 1680 

TCGAATACCA TATTTTTCAA AAATTTGTTG ACCTATTTTC GTATCAGCAT TATGGAAAGA 1740

TGGTAAGCAA TGCTCAAAAA TAACATTTGG ATTACCAGTT TTATCCATTA TTTCTTTATT 1800 

TACTTGATAT GGTTTCAATA ATTCAAGTCG TTCTTTCCAT ACTTCATCAG GTTCACCCAT 1860 

TGATACCCAA ACATCAGTGT AAATTACATC CGAACCTTTT ACaCCTTGGT CaATATCATC 1920 

TGTGATTAAT ATGTTGCCaC CATTTTCaGC GGCAATATTT TTACAGCGAT TTAATAATTC 1980 

ATCTGTTGGA TTTAATTCTT TTGGACAAAC TAAATGGAAG TTCATACCCA TAATGGCAGC 2040

ACCTTGCATT AATGCATTTG CAACGTTATT ACGACCATCT CCAACATATG TAAAGTTAAT 2100 

ATCTGCATAA TCTTTTTTTA AGACTTCTTT TGCTGTTAAG AAATCAGCAA GAACTTGAGT 2160 

TTCTACTGTT CTTTGTGAAA AACCACGGTA TTCAATGCCA TCATACATTC CACCAAGCAC 2280

ACGTGCAGTA TCTTTAGTTG TTTCTTTTTT ACCCATTTGT GATCCAGTTG GGCCTAAATA 2340

AGTTACATTT GCACCTTGAT CATGCGCTGC AACTTCAAAT GCACATCGCG TTCTTGTAGA 2400

ATCTTTTTCA AATAACAGTG CAATATTTTT ATTTTTTAAC ATAGGCTTTT CAGTGCCAAT 2460

ATATTTAGCA CGTTTTAAAT CCTCGGAGAG TGTTAATAAG GTTCTACCTC TTGTCGTGAA 2520

AAGTCTAATA AAGTTAAAAA ACTTCTGTTT CGTAnATTTT TCATTAAnA 2569
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1273 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

CCTGGAACCA TCCaATCGtG CaAATCtTGa AAGaGAATAC GCAACAACAA TTAAATGTAT 60

TGGAACACTA TATTCCAAAT GACCATCCAG CACTCGTTGA ATTAAAAATA TGGGAACGTT 120 

GGTTACATAA ACAAGGTTAC AAAGACATCC ATTTAGATAT TACTGCGCAC CACCTAGATC 180 

CTATTACACA GGTTTATTTA TTCAATGTCA TTTTGCTGAA AATGAATCTC GAGTTTTAAC 240

AGGTGGTTAT TACAAAGGAA GCATCGAAGG GTTTGGATTA GGATTAACAC TTTAAGTAAG 300 

GGAGTATGCA CAATGTTAAG AATCGCCATA GCCAAAGGAC GTCTAATGGA TAGTTTAATT 360 

AACTATTTAG ATGTAATTGA ATATACGACA TTATCAGAAA CATTAAAAAA TAGAGAACGC 420 

CAATTATTAT TAAGTGTAGA TAATATTGAA TGCATTTTAG TAAAAGGAAG TGACGTGCCA 480 

ATCTATGTGG AACAAGGAAT GGCAGACATA GGCATTGTTG GTAGCGACAT ATTAGATGAG 540

CGCCAATATA ATGTTAATAA TTTGTTGAAT ATGCCTTTTG GAGCATGTCA TTTTGCGGTT 600 

GCAGCGAAAC CTGAAACGAC CAATTATCGT AAAATCGCAA CGAGTTATGT TCATACTGCT 660 

GAAACATATT TTAAATCAAA AGGTATTGAT GTCGAATTGA TTAAATTGT^ TGGCTCTGTT 720

GAATTGGCCT GTGTTGTAGA TATGGTAGAC GGAATTGTCG ACATCGTTCA AACAGGTACT 780

ACGCTAAAAG CGAACGGACT GGTTGAAAAG CAACATATTA GTGATATCAA TGCAAGATTA 840

ATAACTAATA AAGCAGCTTA TTTTAAAAAA TCACAATTAA TAGAGCAATT TATTCGCTCT 900 

TTGGAGGTGT CTATTGCCAA TGCTTAATGC ACAACAATTT TTAAATCAAT TTTCATTAGA 960 

AGCACCATTA GATGAGTCAT TGTATCCaAT TATTCGCGAT ATTTGTCAGG AAGTTAAAGT 1020

TTTAGaAATT AGTCATGAmC AAATTAAAGC AGCATTTGAC ACATTAGATG AAAAAACAAA 1140

ACAAGCATTA CAACAAAGTT ATGAAAGAAT TAnAGCATAT CAaGAAaGTA TtaAACAGaC 1200 

GaATCAACAG TTAGAAGaAT CAGTGGaGTG tTrTGaAATA TACCATCCmC taGaAAGTGT 1260 

CGGTATTTAT GTG 1273 
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1308 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

GTTGATAAAT TAAAAATGTT TTTATCAGAT ATTCAAAGTT ACCAACAATA TAGTAAAGAT 60 

CATCCGGTGT ATCAGTTAAT TGATAAATTT TATAATGATC ATTATGTTAT TCAATACTTT 120

AGTGGACTTA TTGGTGGACG TGGACGACGT GCAAATCTTT ATGGTTTATT TAATAAAGCT 180 

ATCGAGTTTG AGAATTCAAG TTTTAGAGGT TTATATCAAT TTATTCGTTT TATCGATGAA 240

TTGATTGAAA GAGGCAAAGA TTTTGGTGAG GAAAATGTAG TTGGTCCAAA CGATAATGTC 300

GTTAGAATGA TGACAATTCA TAGTAGTAAA GGTCTAGAGT TTCCATTTGT CATTTATTCT 360 

GGATTGTCAA AAGATTTTAA TAAACGTGAT TTGAAACAAC CAGTTATTTT AAATCAGCAA 420 

TTTGGTCTCG GAATGGATTA TTTTGATGTG GATAAAGAAA TGGCATTTCC ATCTTTAGCT 480 

TCGGTTGCAT ATAGAGCTGT TGCCGArAAA GAACTTGTGT CAGAAGAAAT GCGATTAGTC 540

TATGTAGCAT TAACAAGAGC GAAAGAACAA CTTTATTTAA TTGGTAGAGT GAAAAATGAT 600 

AAATCATTAC TAGAACTAGA GCAATTGTCT ATTTCTGGTG AGCACATTGC TGTCAATGAA 660 

CGATTAACTT CACCAAATCC GTTCCATCTT ATTTATAGTA TTTTATCTAA ACATCAATCT 720 

GCGTCAATTC CAGATGATTT AAAATTTGAA AAAGATATAG CACAAATTGA AGATAGTAGT 780 

CGTCCGAATG TAAATATTTC AATTGTGTAC TTTGAAGATG TGTCTACAGA AACCATTTTA 840

GATAATGATG AATATCGTTC GGTTAATCAA TTAGAAACTA TGCAAAATGG TAATGAAGAT 900 

GTTAAAGCAC AAATTAAACA CCAACTTGAT TATCGATATC CATATGTAAA TGATACTAAA 960 

AAGCCCTCAA AACAATCTGT TTCTGAATTG AAAAGACAAT ATGAAACAGA AGAAAGTGGC 1020 

ACAAGTTACG AACGAGTAAG GCAATATCGT ATCGGTTTTT CAACGTATGA ACGACCTAAA 1080 

TTTCTAAGTG AACAAGGTAA ACGAAAAGCG AATGAAATTG GTACGTTAAT GCATACAGTG 1140

GATGGATTAA TCGATAAACA TATTATCGAA GCAGATGCGA AAAAAGATAT CCGTATGGAT 1260

GAAATAATGA CATTTATCAA TAGTGATTAT ATTCGATATT GCTGAAGC 1308

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1431 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GATGCCATTn ATnnGTATGC AAGAAGTTGT TCCGGGTTCA GGTGGATTaC CAGTTGGTAC 60 

TGGTGGTAAG ACGTTACTAA TGCTTTCAGG CGGTATAGAC TCACCAGTTG CTGGGATGGA 120 

AGTGATGAGA CGTGGCGTAA CAATTGAAGC GATTCATTTC CATAGTCCAC CATTTACAAG 180

TGATCAAGCA AAAGAAAAAG TTATTGAATT GACACGTATT TTAGCTGAAC GTGTTGGACC 240

AATTAAATTG CATATTGTAC CATTTACAGA ATTGCAAAAA CAGGTAAATA AAGTTGTACA 300 

TCCAAGATAT ACAATGACTT CAACGAGACG TATGATGATG CGTGTTGCTG ATAAATTAGT 360

ACATCAAATA GGGGCTTTAG CTATTGTAAA TGGTGAAAAC CTAGGGCAGG TAGCCAGTCA 420 

AACACTTCAT AGCATGTATG CAATTAATAA TGTAACTTCT ACTCCTGTAT TACGTCCTTT 480

ATTAACTTAC GATAAAGAAG AAATTATTAT TAAATCGAAA GAAATTGGTA CATTTGAAAC 540

ATCTATTCAA CCATTTGAAG ATTGTTGTAC AATTTTCACC CCTAAAAATC CAGTAACCGA 600 

ACCAAACTTT GATAAGGTAG TCCAATATGA AAGTGTCTTT GATTTTGAAG AGATGATTAA 660 

TCGTGCTGTT GAAAATATTG AAACACTTGA AATAACTAGT GATTATAAAA CTATTAAAGA 720 

ACAQCAAACA AACCAATTAA TAAACGACTT TTTATAAATA AAATCCTAGA GTAAATTTAA 780

ACATAAGGGG ATGTTAAACT ATGGATTTGA ACTTAACGAT GATTATAATC ATAATTTTAT 840

TTGGTTTTAT CGCGGCGTTT ATAGATTCGG TTGTAGGGGG TGGCGGTTTA ATTTCTACGC 900

CAGCATTATT AGCAATCGGT CTACCACCAT CTGTGGCTTT AGGTACAAAT AAATTGGCAA 960 

GTTCGTTTGG TTCTTTAACT AGTACGATAA AGTTTATAAG GTCCGGTAAA GTGGACTTAT 1020

ATGTTGTTGC CAAATTATTT GGTTTTGTAT TTTTGGCATC TGCATGTGGC GCATATATTG 1080 

CAACGATGGT TCCGTCACAA ATATTGAAAC CTTTAATCAT CATTGCACTT TCGTCGGTGT 1140

TTATATTCAC ATTACTTAAA AAAGATTGGG GCAATACACG CACGTTTACT CAATTTACAT 1200

TTAAGAAAGC CATAATATTT GCAGCACTTT TTATATTAAT CGGCTTTTAT GATGGATTTG 1260

TAAGTGCAGC AGGAAATGCT AAAGTTTTGA ACTTTGCTTC TAATATAGGT GCGCTTGTAT 1380

TATTTATGGT ATTAGGACAA GTAGATTATG TAATAGGTTT AATTATGGCT A 1431

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4403 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

AATATTATTT TAAATTCAAT ATTTATTGGT GCATTTATTT TAAACTTATT ATTCGCCTTT 60 

ACCATTATTT TCATGGAAAG ACGTTCTGCC AATTCTATCT GGGCTTGGTT ACTAGTCTTA 120 

GTTTTCTTGC CTTTATTCGG CTTCATTTTA TACTTACTAT TAGGACGACA AATTCAACGT 180

GACCAAATTT TCAAAATTGA TAAGGAAGAT AAAAAAGGAT TAGAGTTAAT CGTTGATGAG 240

CAATTAGCTG CTTTAAAAAA TGAAAACTTT TCAAATTCCA ATTATCAAAT TGTAAAATTT 300 

AAAGAAATGA TTCAAATGTT GTTATATAAT AACGCAGCAT TTTTAACAAC AGACAACGAT 360

TTArrrrtAT ACACAGACGG CCAAGAAAAA TTTGATGACC TAATACAAGA CATCCGTAAT 420

GCTACTGATT ATATTCATTT TCAGTACTAT ATTATTCAAA ATGATGAATT AGGTCGTACC 480

ATTTTAAATG AACTTGGTAA AAAAGCGGAA CAAGGTGTAG AAGTTAAAAT TCTTTATGAT 540

GACATGGGTT CTCGTGGACT GCGTAAAAAA GGCTTACGCC CGTTTCGCAA TAAAGGTGGA 600

CATGCTGAAG CATTTTTCCC ATCAAAATTA CCTTTAATTA ACTTGCGTAT GAACAATCGA 660

AACCATCGAA AAATTGTTGT AATAGATGGG CAAATTGGAT ATGTTGGTGG TTTTAATGTT 720

GGTGATGAGT ACTTAGGTAA ATCAAAAAAA TTCGGCTATT GGCGAGATAC GCATTTACGA 780

ATTGTCGGGG ATGCAGTGAA TGCATTGCAA TTACGATTTA TTCTAGATTG GAATTCACAA 840

GCCACACGTG ACCACATCTC CTATGATGAT CGTTATTTCC CAGATGTAAA TTCTGGTGGA 900 

ACAATTGGCG TTCAAATAGC TTCTAGTGGT CCTGACGAAG AATGGGAACA GATTAAATAC 960 

GGCTATTTGA AAATGATTTC ATCTGCTAAA AAATCGATTT ATATTCAATC TCCCTATTTC 1020

ATACCTGATC AAGCCTTTTT AGATTCTATT AAAATTGCGG CATTAGGTGG TGTTGATGTC 1080

AATATCATGA TTCCTAATAA ACCTGACCAT CCGTTTGTTT TTTGGGCTAC TTTAAAAAAT 1140

GCAGCATCCT TATTAGATGC CGGTGTTAAA GTATTTCACT ACGACAATGG CTTTTTACAC 1200

TCAAAAACAC TTGTTATAGA TGATGAAATT GCAAGTGTGG GAACAGCTAA TATGGACCAT 1260

AAATTAAAAC AAGCTTTTAT AGATGATTTA GCAGTATCTT CTGAATTAAC AAAAGCACGT 1380

TATGCTAAGC GAAGTCTTTG GATTAAATTT AAAGAAGGTA TTTCACAATT ATTGTCACCT 1440

ATCTTATAAA ATAGAAATAT GAGGAGTGTA aCTTTAATGC AACAATCAGA CGTCATTAGT 1500 

GCTGCCAAAA AATATATGGA ATCTATTCAT CAAAATGATT ATACAGGCCA TGATATTGCG 1560 

CATGTATATC GTGTCACTGC TTTAGCTAAA TCAATCGCTG AAAATGAAGG TGTTAATGAT 1620 

ACTTTAGTCA TTGAACTCGC ATGTTTGCTT CATGATACCG TTGACGAAAA AGTTGTAGAT 1680

GCTAACAAAC AATATGTTGA ATTGAAGTCA TTTTTATCTT CTTTATCACT ATCAACCGAA 1740

GATCAAGAGC ACATTTTATT TATTATTAAT AATATGAGCT ATCGCAATGG CAAAAATGAT 1800 

CATGTCACTT TATCTTTAGA AGGTCAAATT GTCAGGGATG CAGATCGTCT TGATGCTATA 1860 

GGCGCTATAG GTGTTGCACG AACATTTCAA TTTGCAGGAC ACTTTGGTGA ACCTATGTGG 1920

ACAGAACATA TGTCACTAGA TAAGATTAAT GATGATTTAG TTGAACAGTT GCCACCATCT 1980

GCAATTAAAC ATTTCTTTGA AAAATTACTT AAGTTAGAAT CTTTAATGCA TACAGATACG 2040

GCGAAGATGA TTGCTAAAGA ACGTCACGAC TTTATGATGA TGTACTTGAA ACAGTTTTTT 2100 

ACGGAATGGA ATTGTCACGA CTAGACATTG AAGTTGTAGT ATGATGATGC GATGTAATGG 2160 

CGTGTTGTTG TGGAAGCTTG GTGTCATGCC ATGTTACTTT GATGTGTTGT TGTGGGAGCT 2220 

TGGTGACATG TCATGCTACT TTGATGTGCT GGTACCACGA TGCGTCTTGA TGTAGTGCTA 2280 

TGATGTGGCA TTGCGGTGTT ATGGTGTTAT AGACAGGTTT GGCGTTGATG CCATGTTACT 2340

TTGATGTGCT GGTACCACGA TGCGACTTGA TGTAGTGCTA TGATGTGGCA TTGCGGTGTT 2400

ATGGTGTTAT AGACCGGTTT GATGTTGATG CCATGTTACT TTGATGTGCT GGTGCTACGA 2460 

TGCGACTTGA TGTAGTGCTA TGATGTGGCG TTGCGCTGTT ATGGTGTTAT AGCCAGGTTT 2520 

GGTGXTGATG TCATGCCGTT ACGATTCTAT GATATGTTGT TGGGACGTTG CAATGTGTAT 2580 

TATGCCGTTG TGACGTTATT ATTTCACACT GTTACATGTA TAAGTGAATT GCTGTGGAAA 2640 

TTTGCGACAT ATACTGCTAC ACTGATGAAT CATTGTGTCA AGATGACATT GCGATGAAGA 2700

ATGACAACTC TGTTATTAAC CACTTTTTAC ATACTGAAAA CTCGTTAATA TTATTTCAAA 2760

TAAAAACAGC AGTAGGATGA CTTTCACATT TGAAATCATC TTACTGCTGT TTCTATTTAT 2820 

CACATATTGT ATAATGTGAC ACTAAGTTTC GCTATTGAAG CGAAAAATAA TGTGCGCCCT 2880 

ATAAAGTTAA AATTATCTTC AACTTTTAGG GTGCACATTA TTTGGACTTG CTAAGGTTAT 2940

TTCTTTTTCT TTTTAGACAC AACTTGTGTG TTTTTGCCTT TTTTATTGct GCCGCCGTTG 3000

TGCTCTCTTT CATACGCTTC AATGAAAGGT TGTACTTCTT TTTTAGCGAC TTTTTCATAA 3060 
CCAAGTGCTG ATGCTGAGCT TAATGAAATC CAGATAATCA TAATTGGTGA AATGACCATC 3180

ATCATGTAAC CCATTTGACG TTGTTCGTCT GGCATCGTTT TACTTGATAC ATATGCTTGG 3240

ATAAAGTATA AAACACCGGC AATAATTGTA ATCCAAATAT CAGGACGTCC TAAATCGAAC 3300

CATAAGAAGT GTGGATATTT AAACAAACCA TCTACAAGTT GGTCTTTAAG TACAAAGTAT 3360

AATCCCATGA TGATTGGTAA TTGGATTAGC ATTGGTAAAC AACCCAACAT ACTCTTAATC 3420

GGGTTCATGT CATACTTTTT ATATACTTGC ATTAATTCTT GGTTTGCAGC CATTTTTTCT 3480

TCTTGTGTAC GCGnCaCGTT CACTTTTTCT TGAATTTTTT CAACTTCTGG CTTTGCAACT 3540

TTCATTTTTT GACGCATCAT ATGACTATTT TTATAGTTTG ACAACATGAA TGGTAATAAA 3600

ATAATACGAA TTACCAATAC AAGGATAATA ATAGCTAAAC CATAATTGTC GTTTAATAAG 3660

TTATTTCCCA ACCAATCCAA TACATTTTTC ATTGGATCTA CGAATGTATT GTAGAAAAAy 3720

CwCtACGTTT TTCAGGTTTA GAATAGTCAC AACCAGCCAA AAAGACCATA ATACCTAAAA 3780

ATAATGGTAG TAACGCTTTT TTCTTCATTT TTCCACCTCT ATCATTATAT TCACATAGGA 3840

TTTATTCTAT CACATTAATG AGTACGTATG AAACAATAAG TGGAAAAATT TAACTAATTA 3900

TTAAAAAAAT CTTTGAATCG ATTAACAGTC TTTTCAATAT TTTCACTTTT AGAAATGGCT 3960

GAAATGACTG AAATTCCATT GGCACCTGCT TCTACAATCG GCGCCACATT ATTAGTATTG 4020

ATACCGCCAA TAGCTACAAT CGGTAGTTGC GGATTCATTT CTTTAAACGT TGCAATCATT 4080

TCTGGACCTA CTGGTATATG CGCGTCATGC TTCGACGGCG TAGGATAGAT TGGTCCAACA 4140

CCTATATAAT CmACATGAGT TAAATCAGAT TTTGCATACT CATCTAAATC ACTAATACTA 4200

AGTCCAATAA TTTTATCAGT GAAATATTGT GCTATCTCTT TGACTTTCGC ATCATCTTGA 4260

CCGACATGTA TACCATCCGC GTTAATTTCT TTTGCCAAGG ATACATCATC ATTAACGATA 4320

AAAGGCACAT CATATTGATG ACAGAGATGC TGTAATTCTT TAGCTAATAC AAGTTTATCG 4380

TTTCCTTTTA AAGCTGATTC ACC 4403
(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1808 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
TGGAnCCAAT ATTAGAAATG ATTAAAACAT TAACAGGTAT TAATAGTCCT TCAGGAGnCA 60 
TAACAAATAA AGGTGCGTTA TTAATAACAG TGCCAGGCAA AAATGATGAA GTACAACGCT 180

GTATTACTGC TCATGTTGAT ACTTTAGGTG CaATGGTTAA AGAAATTAAA GAAGATGGTC 240

GCTTaGCAAT AGAATTAATT GGAGGATTCA CGTATAACGC GATTGAGGGT GAATATTGCC 300

AAATTAAAAC TGATGCTGGT CAAATATATA CAGGAACAAT TTGTCTGCAT GAAACAAGTG 360

TTCATGTATA TAGAAATAAT CATGAAATAC CTAGAGATCA AAAGCATATG GAAATAAGAA 420

TTGATGAAGT AACTACATCA GAAGAAGATA CAAAGAGTTT AGGTATTTCA GTAGGTGATT 480

TTGTTAGCTT TGATCCACGT ACAGTTATCA CGTCATCAGG TTTTATTAAA TCTCGTCATT 540

TAGATGATAA AGCTAGCGTA CGgTtGATAC TACAATTACT AAAGAAATTA AAAGAAGAGC 600

AAATAATATT ACCACATACA ACGCAATTTT ATATTTCTAA TAACGAAGAA ATAGGTTACG 660

GTGCAAATGC ATCAATTGAT TCGAAAATCA AAGAATATAT TGCATTAGAT ATGGGCGCGT 720

TGGGAGACGG TCAAGCATCG GATGAATATA CAGTTTCTAT TTGTGCCAAA GATGCTTCAG 780

GTCCATATCA TAAGCAATTG AAATCGCACC TAGTTAATCT TTGCAAAATA AATAACATTC 840

CATATAAAGT AGACATATAT CCATATTATG GTTCAGATGC TTCAGCAGCT TTACATGCTG 900

GTGCGGATAT CAGACATGGT TTATTTGGCG CTGGCATTGA ATCATCTCAT GCAATGGAAC 960

GAACACATAT TGATTCTATT AAAGCGACAG AGAAATTACT ATATGCATAT TGCTTATCAC 1020

CAATTGAGTA AACAATTAGT GTTGACAAAT GTGaACGACC TATGTAATAT AATGAACTAT 1080

AAAAATAATT AGAATTTTCT AAAGAAATAG TAGCAGATAT GAAACGTAGC AAATAGAAAG 1140

CTAATGGGTG ATGGGAATTA GCACGCCATA TCTTGTGAAT TGGACTTTGG AAAACAATTG 1200

AATGAGTTTT GAAAGTGAAC ATGAATTATG TTAACTAAGG TGGCACCACG GTAACGCGTC 1260

CTTACAGGTA TATGCGTTAT GTGGTGTCTT TTTATTTAGA CAAAATGTAG TAGTTAATTA 1320

AAGGTAGCAA CAGAAAGTTA GTGGATGATG TGAACTAACA CCGAGATTAA TGAAATTGGG 1380

TTTTGTCTGC AACAGAAAAA TTATATATAG TAAAGAGTGA ACTATGAATA TTTCGAATAT 1440

TCGGTTAATT TAGGTGGTAC CACGCGTCAC nTCCTTTATA TTGATAAGGA TGCTGGCGCT 1500

TTTTTGAAAG GAGCGTATAG AATGGATATA TTTTATAAAA AAATAAAAGC AAATGTAACG 1560

CCCGAAGTTT TAGCACAACT TCATTCCAAG AAGaTCATTT TGGAAAGTAC AAATCAACAA 1620

CAAACTAAAG GTCGCTATTC AGTTGTTATT TTTGATATTT ATGGCACTTT AACTTTAGAT 1680

AATGATGTAT TATCAGTAAG TACTTTAAAA GAATCGTATC AAATCACTGA AAGACCGTAC 1740

CATTATTTAA CGACTAAnAT AAATGAAGAC TACCATAATA TTCCAAGATG AGGCAACTTA 1800

AGTCATTA
(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1320 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

TGGTCGTCAA TTTCTTGATT ATATCTATAA TCCTCATTTT CAATATTAGA GTCTGTAGAA 60 

TCATCGATAT TATTATCATT CGCATGACTA GAAGCAGAAT CATTATTTTT ATCATTGCTT 120 

TCTTCTTTTT TGAAGTCTTT ATTTATCAAG TAAATTTCTT CATCAAAATC AGCTTGTTGA 180

GATGTATCAT CTTTATTTTG ATTAGAAAAA TGTGTAGCCT TTGATCTTTT TCTTTGCCGT 240

CTTTTCTTAG ATGTATTCCT CGTAAATAAT TCTAATTCAT CTTTATCTTC ATTTGATTCT 300 

TGTTGATCGT TCTTCGTTTT ATCATCCATC AATACTCACA CCCTTTAATA AGATGGTAAA 360

TGGGCACGGA ATCTTTCAAT AAATTTCTCT CCACGCTCTT CAAAAGTACT ATATTGATCC 420

CAACTCGCAC AAGCAGGTGA CAATAATACA ACATCATTTG GTTCTATAAT ATCTTGTACT 480

TTATCAACAG CGTCTTCGAC ATTGTTCGCT TCAATGACCG ATTTCCCTTG ACTATTACCT 540

AGTTTAGCAA ACTTAGCTTT CGTTTGTCCG AATACAACCA TCGCGCGAAC ATTTTCCATA 600 

TAAGGAATGA GTTCGTCAAA TTCATTCCCT CGATCCAAAC CACCACATAA CCAAATGATT 660 

GGTTGATTAA ATGAATTTAA GGCAAACTGT GTTGCTAGCG TGTTTGTTGC TTTGGAATCA 720 

TTATAATATT TATTAGTTCT ATTAGTACCA ACATATTGCA ATCTATGCTC TATTCCTGAA 780

AATGTAGTTA AACTATCAAT AATTGCtTTA ATAGGTACAC CAGCanAATA CAAGCAAGCA 840

CAGCTGCTAA TATATTTcTA AATTATGTTC ACCAGGCAAT ACTAGAtCTT CAGTGTTAAT 900 

AATadGAACA CCTTTATaAA CGATAAAACC ATCTTtAATA TAAaTACCAT CArCTtCTTG 960 

TTGAGTTGAG AAATACAATG TCTTAGCTTT TAATTCTTCC GACTCTATCA CTTGTCTTTG 1020

ATGATAATTA CAAATCAAAT AATCCTCTTC CGTTTGATTT TTATATATTT GCTTTTTAGC 1080 

ATTTTGATAG TTTTCTAAAT TTTCATGGTA ATCTAGATGC GCCGAATAAA TGTTAGTAAT 1140 

TATAGCAATG TGTGGTTTAT ACTTTTCGAT TCCAAGTAAC TGGAATGACG ACAACTCTGT 1200 

AACTAAATAA TCTGTAGGCT TTACTTCTTG TGCTACTTTA GATGCAACAT AACCAATATT 1260 

GCCGGATAAT CTTCCAGTTA AGCGACTTTT TTTAAACATA TCTCCAATTA GAGAAGTAAC 1320 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4280 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

TTTACACCAA TCAAAAAATC GAACTGATAT AAATAAGTAC AAAGCTTATC TATCAATCCG 60 

ATTTAGTTAT AAAACAAAAA AAGCCACAGT AATGTGGCTT TTTGTTATAT TCAGTATCAA 120

AATGGTATCA ATAGCCATTT TCGGAAGTCA AGAATGGCTT AACAACGCGG TTTAAAGCTA 180 

TCCAATACTA CCTTCCATTT CGAACTTGAT TAAACGGTTC ATTTCGACCG CGTATTCCAT 240

TGGAAGTTCT TTTGTAAATG GTTCGATGAA TCCCATAACA ATCATTTCTG TCGCTTCTTC 300

TTCAGAAATA CCACGACTCA TTAGATAGAA TAATTGTTCT TCAGAAACTT TTGAAACCTT 360

GGCTTCATGT TCTAATGATA TTTGATCGTT GAATACTTCG TTATATGGAA TTGTATCTGA 420

TGTTGATTCG TTATCTAAGA TTAATGTATC ACATTCAATA TTTGAACGAG CACCTTTTGC 480

TTTACGTCCA AAATGAACAA TACCGCGATA AATAACTTTA CCACCATTTT TAGAAATAGA 540

TTTAGAAACA ATTGTAGAAG ATGTATTAGG TGCTTTATGA ATCATTTTAG CACCGGCATC 600 

TTGAACTTGT CCTTTACCAG CAAATGCAAT AGATAATGTA CTACCTTTTG CACCTTCACC 660

TAAAAGAACA CAGTTTGGAT ATTTCATCGT TAACTTAGAA CCTAAGTTAC CATCTACCCA 720

TTCCATATTT CCGTTTTCAT AAACAAAAGT ACGTTTTGTA ACTAAATTGT ATACATTGTT 780 

CGCCCAGTTT TGAATCGTAG TATAACGAAC GTGCGCATCT TTATGCACAA TGATTTCCAC 840

AACAGCAGAG TGTAAAGAAC TAGTTGTATA AACTGGTGCA GTACAACCTT CTACGTAATG 900 

TACAGAAGCA CCTTCATCAG CAATGATTAA TGTACGTTCA AATTGACCCA TGTTCTCAGA 960

GTTAATACGG AAATAAGCTT GTAGTGGCGT ATCTAGTTTG ATATTTTTAG GTACATAAAT 1020

GAAGGAACCA CCTGACCATA CTGCTGAGTT TAACGCCGCA AATTTGTTAT CTGCTGCAGG 1080 

TACTACAGAA GCAAAGTATT TTTTGAATAA TTCTTCATTT TCTTGTAAAG CACTATCTGT 1140 

ATCTTTAAAG ATAATACCTT TTTCTTCAAG TTCTTTTTCC ATATTATGGT AAACAACTTC 1200

AGATTCATAT TGAGCAGAAA CACCAGCTAA ATATTTTTGT TCAGCTTCAG GAATTCCTAA 1260 

TTTATCGAAA GTTCTTTTAA TTTCTTCTGG CACTTCATCC CATGAACGTT CAGCTTGTTC 1320

TGAAGGCTTT ACATAGTAAG TAATGTCATC GAAATTCAAT TCTGATAAGT CGCCACCCCA 1380

TTGAGGCATT GGCATTTTAT AAAACAATTT TAATGATTTA AGACGGAAAT CTAACATCCA 1440

TTCCGGCTCA TTTTTCATGT TAGAAATTTC TCTAACGATA TTCTCAGTTA AACCACGTTC 1500

TGATCTGAAA ATGGACACAT CATCGTCGTG GAATCCATAT TTATAATCCC CAACATCAGG 1560

TTTAATTCAT GATGTAAACC ATATTATAAC AATGACATGA CATCTTATAA AAATTTTTAT 1680

ACTTTTATAT GTCTAATATC AAAATTATCT ATGATTAACA GCATTCTATT CTTCTTCAGT 1740

CGTACCTTCT GCTTTACCTT CTTTAGCAAC AGTACCTTTT TCCAATGCTT TCCAAGCTAA 1800

TGTGGCACAT TTAATACGAG CTGGGAATTG AGATACACCT TGCAATGCTT CAATATCTCC 1860

CATTTCTTCT GTAATCACAT AGTCTTCACC AAGCATCATT TTCGTAAATT CTTGGCTCAT 1920 

TTGCATTGCT TCTCCAAGTG AATGACCTTT AACAGCTTGT GTCATCATCG ATGCACTTGC 1980

CATTGAAATC GAACAACCTT CACCTTCAAA CTTAGCATCT TTTATAATGC CGTCTTCTAT 2040

ATCAAATGTT AGTCGTATAC GGTCACCGCA TGTCGGGTTA TTCATATCTA CTGTCATAGA 2100 

CCCGTTATCT AATACACCTT TATTTCTAGG ATTTTTATAA TGATCCATAA TGACAGATCT 2160

ATATAATTGA TCTAGATTAT TAAAATTCAT AAGAGAAAAA CTCCTTCGTT TGTTTCAAGG 2220

CATTTATTAA CTGATCAACG TCTTCTTTCG TGTTGTATAT ATAAAAACTC GCTCTAGCTG 2280

TTGAAGACAC ATTTAACCAT TTCATTAACG GTTGCGCACA ATGATGCCCA GCTCTAACCG 2340

CTACACCTTC TGTATCTACG GCTGTAGCAA CATCGTGTGG ATGTACATCT TGTAAATTAA 2400

ACGTTATTAC ACCTGCACGA CGATCCTTTG GCGGGCCATA AATTTCAATT CCTTCAATTG 2460

CAGACATTTG CTCATAAGCA TATATCGTTA ATTCTTGTTC ATATTTATGA ATTGCATCAA 2520 

AACCTATGCG TTCTAAATAG CGAATAGCTT CTGCAAGCCC AATTGCTTGA GCAATTAATG 2580

GAGTACCCGC CTCAAATTTA GTAGGTAAAT CAGCCCATGT TGCATCATAC TTACTTACAA 2640

AATCAATCAT GTCGCCACCG AACTCAATCG GTTCCATTTT TTGTAGTAAC TCACGTTTAC 2700 

CAAATAATAC GCCAATACCT GTTGGTCCAA GCATTTTATG ACCACTAAAA CTATAAAAAT 2760

CAGCATTCAT TTCTTGCATA TCAAGTTTCA TATGTGGTGC TGctTGCGCC CCATCAACAC 2820

TGATSATTGC ACCATGTTGA TGAGCTATTT CTGCAATGGT TTTAACATCA TTAATTGTAC 2880

CGAGCACATT AGATATATGT GCAATAGCAA CGATCTTTGT TTTATCATTA ATCGTTTGCT 2940

TAATATCCTC GATGTTTAAT TCACCGTCAG CTGTCATTGG TATAAATTTC AATGTCGCAT 3000

TTTTACGCTT TGCTAACTGT TGCCAAGGAA CAATATTGGC ATGATGTTCC ATTTCAGTGA 3060 

CAACAATTTC ATCGCCCTCT TCAACATTTG CATCACCATA GCTATGTGCT ACAAGGTTAA 3120

TCGACGCAGT TGTTCCGCGT GTAAAAATGA TTTCTTCAAA ATACTTCGCA TTAATAAAAC 3180

GACGAACGGT TTCACGGGCA TTTTCATAAC CATCAGTTGC CAATGATCCT AATGTATGAA 3240

CACCACGATG AACGTTTGAA TTATAACGCT TGTAGTAATC TTCTAAAACA TTTAACACTT 3300

GCACAGGCGT TTGACTTGTC GCTGTTGAAT CAAGATATGC TAAACGTTTG CCATTGACTT 3360

CTTCATTCAC GACCTTTCTT AAATAAAAAT CCTAATCATT TAAATACTGA CGTTGTATTA 3480

GTCTTATACC AATATCGACA GTCTATATCT ATTACAAACT TTTATTTTCA AAATATTATT 3540

TAGAAACTTT GCGTTCAATT ACTTCTCTCA ATTGACGTTT AACGTCTTCG ATAGGTAATT 3600

CACGTACTAC TGGATCTAAG AAACCATGTA TAACAAGACG TTCCGCTTCT CTTTGAGAAA 3660 

TACCACGACT CATTAAATAG TAAAGTTGAT CTGGATCAAC ACGACCTACT GATGCAGCAT 3720

GACCAGCTTG TACATCATCT TCATCAATTA ATAAAATAGG ATTCGCGTCA CCACGAGCAT 3780

GTTCAGATAA CATTAATACA CGTGATTCCT GATTAGCAAT TGATTTAGTT CCACCATGCT 3840

TAATGTAGCC GATACCATTA AATACAGACG ATGCATGTTC TTTCATAACA CCATGTTTAA 3900 

GGATATAACC ATCTGTTTCT TTACCATATT GTACGATTTT AGATGTTAGA TTAATTTTTT 3960 

GTTCGCCTGT ACCTACAACT ACTGATTTAA GTGAACTTGT TGAACGATCA CCAAATAAAT 4020 

TTGTTGTATT ATCAATAATT TGGCTACCCT CATTCATTAA ACCTAGTGCC CAATTAATTG 4080

AGGCATCCGC TTCAGTAATA CCACGTCGAA TGATATGACC TGTAAAGCCT TTATCCATAT 4140

AGTCCACTGA GCCATATGTG ATATTTGAAT TTGCACCAGC AATCACTTCA GAAATAATAT 4200

TtAATTGATT TCCTTCACCA GATGCATTTG mTAAGTAATT TTCAACATAT GTGACTTCGG 4260

CGCTTTCTTC AGTAACGATG 4280
(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15598 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

TCnGACTCGA ACGGTGltiAAC TAttCCGTTG TaATTCCgGA GgAAsCAAGG TATGCCCATC 60

TGCaAAGAAA gaATGsAATG AACTTTTTGG AAATGTAGAA GTGGTAAATA AAGATAAAGG 120

ATATTACATT CTGAGAAGTA TAAAAGCTTG AAATGAAATG GATATTCTGT TATAGTTATA 180

TAATGTAAAA ATTTATGTTC AATAAGTGTG TACTTTTACG TTAAATAGAT AAGTTAATTA 240

AGAATAAATA TAGAATCGAA AATGGTGTCA TCATTAGTGT TGCCGTTTTC TTTTTGTCTT 300

TTTATTAATA TGCTTATGGT ATTTAGCTAA AAGCGGATCA CATAATTTTT GAGGGGTGAA 360

TCTGTTTGGC AGGTCAAGTT GTCCAATATG GAAGACATCG TAAACGTAGA AACTACGCGA 420

GAATTTCAGA AGTATTAGAA TTACCAAACT TAATAGAAAT TCAAACTAAA TCTTACGAGT 480
CTGGTAATTT GTCATTAGAG TTTGTGGATT ACCGTTTAGG AGAACCAAAA TATGATTTAG 600

AAGAATCTAA AAACCGTGAC GCTACTTATG CTGCACCTCT TCGTGTAAAA GTGCGTCTAA 660

TCATTAAAGA AACAGGAGAA GTTAAAGAAC AAGAAGTCTT TATGGGTGAT TTCCCATTAA 720

TGACTGATAC AGGTACGTTC GTTATCAATG GTGCAGAACG TGTAATCGTA TCTCAATTAG 780

TTCGTTCACC ATCCGTTTAT TTCAATGAAA AAATCGACAA AAATGGTCGT GAAAACTATG 840

ATGCAACAAT TATTCCAAAC CGTGGTGCAT GGTTAGAATA TGAAACAGAT GCTAAAGATG 900

TTGTATACGT ACGTATTGAT AGAACACGTA AACTACCATT AACAGTATTG TTACGTGCAT 960

TAGGTTTCTC AAGCGACCAA GAAATTGTTG ACCTTTTAGG TGACAATGAA TATTTACGTA 1020

ATACTTTAGA GAAAGACGGC ACTGAAAACA CTGAACAAGC GTTATTAGAA ATCTATGAAC 1080

GTTTACGTCC AGGTGAACCA CCAACTGTTG AAAATGCTAA AAGTCTATTG TATTCACGTT 1140

TCTTTGATCX: AAAACGCTAT GACTTAGCAA GCGTGGGTCG TTATAAAACA AACAAAAAAT 1200

TACATTTAAA ACATCGTTTA TTTAATCAAA AATTAGCTGA GCCAATTGTA AATACTGAAA 1260

CTGGTGAAAT TGTAGTTGAA GAAGGTACAG TGCTTGATCG TCGTAAAATC GACGAAATCA 1320

TGGATGTACT TGAATCAAAT GCAAACAGCG AAGTGTTTGA ATTGCATGGT AGCGTTATAG 1380

ACGAGCCAGT AGAAATTCAA TCAATTAAAG TATATGTTCC TAACGATGAT GAAGGTCGTA 1440

CGACAACTGT AATTGGTAAT GCTTTCCCTG ACTCAGAAGT TAAATGCATT ACACCAGCAG 1500

ATATCATTGC TTCAATGAGT TACTTCTTTA ACTTATTAAG CGGTATTGGA TATACAGATG 1560

ATATTGACCA TTTAGGTAAC CGTCGTTTAC GTTCTGTAGG TGAATTACTA CAAAACCAAT 1620

TCCGTATCGG TTTATCAAGA ATGGAAAGAG TTGTACGTGA AAGAATGTCA ATTCAAGATA 1680

CTGAGTCTAT CACACCTCAA CAATTAATTA ATATTCGACC TGTTATTGCA TCTATTAAAG 1740

aatiStttgg tagctctcaa TTATCACAAT TCATGGACCA AGCAAACCCA TTAGCTGAGT 1800

TAACGCATAA ACGTCGTCTA TCAGCATTAG GACCTGGTGG TTTAACACGT GAACGTGCTC 1860

AAATGGAAGT ACGTGACGTT CACTACTCTC ACTATGGCCG TATGTGTCCA ATTGAAACAC 1920

CTGAGGGACC AAACATTGGA TTGATTAACT CATTATCAAG TTATGCACGT GTAAATGAAT 1980

TCGGCTTTAT TGAAACACCA TATCGTAAAG TTGATTTAGA TACACATGCT ATCACTGATC 2040

AAATTGACTA TTTAACAGCT GACGAAGAAG ATAGCTATGT TGTAGCACAA GCAAACTCTA 2100

AATTAGATGA AAATGGTCGT TTCATGGATG ATGAAGTTGT ATGTCGTTTC CGTGGTAACA 2160

ATACAGTTAT GGCTAAAGAA AAAATGGATT ATATGGATGT ATCGCCGAAG CAAGTTGTTT 2220

CAGCAGCGAC AgcATGTATT CCATTCTTAG AAAATGATGA CTCAAACCGT GCATTGATGG 2280
CAGGTATGGA ACACGTTGCA GCACGTGATT CTGGTGCGGC TATTACAGCT AAGCACAGAG 2400

GTCGTGTTGA ACATGTTGAA TCTAATGAAA TTCTTGTTCG TCGTCTAGTT GAAGAGAACG 2460

GCGTTGAGCA TGAAGGTGAA TTAGATCGCT ATCCATTAGC TAAATTTAAA CGTTCAAACT 2520

CAGGTACATG TTACAACCAA CGTCCAATCG TTGCAGTTGG AGATGTTGTT GAGTATAACG 2580

AGATTTTAGC AGATGGACCA TCTATGGAAT TAGGAGAAAT GGCATTAGGT AGAAACGTAG 2640

TAGTTGGTTT CATGACTTGG GACGGTTACA ACTATGAGGA TGCCGTTATC ATGAGTGAAA 2700

GACTTGTGAA AGATGACGTG TATACTTCTA TTCATATTGA AGAGTATGAA TCAGAAGCAC 2760

GTGATACTAA GTTAGGACCT GAAGAAATCA CAAGAGATAT TCCTAATGTT TCTGAAAGTG 2820 

CACTTAAGAA CTTAGACGAT CGTGGTATCG TTTATATTGG TGCAGAAGTA AAAGATGGAG 2880

ATATTTTAGT TGGTAAAGTA ACGCCTAAAG GTGTAACTGA GTTAACTGCC GAAGAAAGAT 2940 

TGTTACATGC AATCTTTGGT GAAAAAGCAC GTGAAGTTAG AGATACTTCA TTACGTGTAC 3000

CTCACGGCGC TGGCGGTATC GTTCTTGATG TAAAAGTATT CAATCGTGAA GAAGGCGACG 3060

ATACATTATC ACCTGGTGTA AACCAATTAG TACGTGTATA TATCGTTCAA AAACGTAAAA 3120 

TTCATGTTGG TGATAAGATG TGTGGTCGAC ATGGTAACAA AGGTGTCATT TCTAAGATTG 3180 

TTCCTGAAGA AGATATGCCT TACTTACCAG ATGGACGTCC GATCGATATC ATGTTAAATC 3240

CTCTTGGTGT ACCATCTCGT ATGAACATCG GACAAGTATT AGAGCTACAC TTAGGTATGG 3300 

CTGCTAAAAA TCTTGGTATT CACGTTGCAT CACCAGTATT TGACGGTGCA AACGATGACG 3360

ATGTATGGTC AACAATTGAA GAAGCTGGTA TGGCTCGTGA TGGTAAAACT GTACTTTATG 3420 

ATGGACGTAC AGGTGAACCA TTCGATAACC GTATTTCAGT AGGTGTAATG TACATGTTGA 3480

AACTTGCGCA CATGGTTGAT GATAAATTAC ATGCGCGTTC AACAGGACCA TATTCACTTG 3540

tTAcicAACA ACCACTTGGC GGTAAAGCGC AATTCGGTGG ACAACGTTTT GGTGAGATGG 3600

AGGTATGGGC ACTTGAAGCA TATGGTGCTG CATACACATT ACAAGAAATC TTAACTTACA 3660

AATCCGATGA TACAGTAGGA CGTGTGAAAA CATACGAGGC TATTGTTAAA GGTGAAAACA 3720

TCTCTAGACC AAGTGTTCCA GAATCATTCC GAGTATTGAT GAAAGAATTA CAAAGTTTAG 3780

GTTTAGATGT AAAAGTTATG GATGAGCAAG ATAATGAAAT CGAAATGACA GACGTTGATG 3840

ACGATGATGT TGTAGAACGC AAAGTAGATT TACAACAAAA TGATGCTCCT GAAACACAAA 3900

AAGAAGTTAC TGATTAATAC GCAATTTACA AAACAGGCAA AAAGATACTA AGCTGAATTT 3960

TATTGATGAT TCAGTTTAGT ACTTTAAGCC ATTTTAAATA AATGCAAATC AATCAAATAG 4020 

CACAGCTAAT CTAAATTGAA GGAGGTAGGC TCCTTGATTG ATGTAAATAA TTTCCATTAT 4080

AAACCTGAAA CAATCAACTA CCGTACATTA AAACCTGAAA AAGATGGTCT ATTCTGTGAA 4200

AGAATTTTCG GACCTACAAA AGACTGGGAA TGTAGTTGTG GTAAATACAA ACGTGTTCGC 4260

TACAAAGGCA TGGTCTGTGA CAGATGTGGA GTTGAAGTAA CTAAATCTAA AGTACGTCGT 4320

GAAAGAAT3G GTCACATTGA ACTTGCTGCT CCAGTTTCTC ACATTTGGTA TTTCAAAGGT 4380

ATACCAAGTC GTATGGGATT ATTACTTGAC ATGTCACCAA GAGCATTAGA AGAAGTTATT 4440

TACTTTGCTT CTTATGTTGT TGTAGATCCA GGTCCAACTG GTTTAGAAAA GAAAACTTTA 4500

TTATCTGAAG CTGAATTCAG AGATTATTAT GATAAATACC CAGGTCAATT CGTTGCAAAA 4560 

ATGGGTGCAG AAGGTATTAA AGATTTACTT GAAGAGATTG ATCTTGACGA AGAACTTAAA 4620

TTGTTACGCG ATGAGTTGGA ATCAGCTACT GGTCAAAGAC TTACTCGTGC AATTAAACGT 4680

TTAGAAGTTG TTGAATCATT CCGTAATTCA GGTAACAAAC CTTCATGGAT GATTTTAGAT 4740 

GTACTTCCAA TCATCCCACC AGAAATTCGT CCAATGGTTC AATTAGATGG TGGACGATTT 4800

GCAACAAGTG ACTTAAACGA CTTATACCGT CGTGTAATTA ATCGAAATAA TCGTTTGAAA 4860

CGTTTATTAG ATTTAGGTGC ACCTGGTATC ATCGTTCAAA ACGAAAAACG TATGTTACAA 4920

GAAGCCGTTG ACGCTTTAAT TGATAATGGT CGTCGTGGTC GTCCAGTTAC TGGCCCAGGT 4980

AACCGTCCAT TAAAATCTTT ATCTCATATG TTAAAAGGTA AACAAGGTCG TTTCCGTCAA 5040

AACTTACTTG GTAAACGTGT TGACTATTCA GGACGTTCAG TTATTGCAGT AGGTCCAAGC 5100 

TTGAAAATGT ACCAATGTGG TTTACCAAAA GAAATGGCAC TTGAACTATT TAAACCATTC 5160

GTAATGAAAG AATTAGTTCA ACGTGAAATT GCAACTAACA TTAAAAATGC GAAGAGTAAA 5220 

ATCGAACGTA TGGATGATGA AGTTTGGGAC GTATTGGAAG AAGTAATTAG AGAACATCCT 5280

GTATTACTTA ACCGTGCACC AACACTTCAT AGACTTGGTA TTCAAGCATT TGAACCAACT 5340

TTAGTTGAAG GTCGTGCGAT TCGTCTACAT CCACTTGTAA CAACAGCTTA TAACGCTGAC 5400

TTTGACGGTG ACCAAATGGC GGTTCACGTT CCTTTATCAA AAGAGGCACA AGCTGAAGCA 5460 

AGAATGTTGA TGTTAGCAGC ACAAAACATC TTGAACCCTA AAGATGGTAA ACCTGTAGTT 5520 

ACACCATCAC AAGATATGGT ACTTGGTAAC TATTACCTTA CTTTAGAAAG AAAAGATGCA 5580

GTAAATACAG GCGCAATCTT TAATAATACA AATGAAGTAT TAAAAGCATA TGCAAATGGC 5640

TTTGTACATT TACACACTAG AATTGGTGTA CATGCAAGTT CGTTCAATAA TCCAACATTT 5700

ACTGAAGAAC AAAACAAAAA GATTCTTGCT ACGTCAGTAG GTAAAATTAT ATTCAATGAA 5760

ATCATTCCAG ATTCATTTGC TTATATTAAT GAACCTACGC AAGAAAACTT AGAAAGAAAG 5820

ACACCAAACA GATATTTCAT CGATCCTACA ACTTTAGGTG AAGGTGGATT AAAAGAATAC 5880

GAAGTATTCA ACAGATTTAG CATCACTGAT ACATCAATGA TGTTAGACCG TATGAAAGAC 6000

TTAGGATTCA AATTCTCATC TAAAGCTGGT ATTACAGTAG GTGTTGCTGA TATCGTAGTA 6060

TTACCTGATA AGCAACAAAT ACTTGATGAG CATGAAAAAT TAGTCGACAG AATTACAAAA 6120 

CAATTCAACC GTGGTTTAAT CACTGAAGAA GAAAGATATA ATGCAGTTGT TGAAATTTGG 6180 

ACAGATGCAA AAGATCAAAT TCAAGGTGAA TTGATGCAAT CACTTGATAA AACTAACCCA 6240 

ATCTTCATGA TGAGTGATTC AGGTGCCCGT GGTAACGCAT CTAACTTTAC ACAGTTAGCA 6300 

GGTATGCGTG GATTGATGGC CGCACCATCT GGTAAGATTA TCGAATTACC AATCACATCT 6360 

TCATTCCGTG AAGGTTTAAC AGTACTTGAA TACTTCATCT CAACTCACGG TGCACGTAAA 6420 

GGTCTTGCCG ATACAGCACT TAAAACAGCT GACTCAGGAT ATCTTACTCG TCGTCTTGTT 6480 

GACGTGGCAC AAGATGTTAT TGTTCGTGAA GAAGACTGTG GTACTGATAG AGGTTTATTA 6540 

GTTTCTGATA TTAAAGAAGG TACAGAAATG ATTGAACCAT TTATCGAACG TATTGAAGGT 6600

CGTTATTCTA AAGAAACAAT TCGTCATCCT GAAACTGATG AAATAATCAT TCGTCCTGAT 6660 

GAATTAATTA CACCTGAAAT TGCTAAGAAA ATTACAGATG CTGGTATTGA ACAAATGTAT 6720 

ATTCGCTCAG CATTTACTTG TAACGCACGA CATGGTGTTT GTGAAAAATG TTACGGTAAA 6780

AACCTTGCTA CTGGTGAAAA AGTTGAAGTT GGTGAAGCAG TTGGTACAAT TGCAGCCCAA 6840

TCTATCGGTG AACCAGGTAC ACAGCTTACA ATGCGTACAT TCCATACAGG TGGGGTAGCA 6900

GGTAGCGATA TCACACAAGG TCTTCCTCGT ATTCAAGAGA TTTTCGAAGC ACGTAACCcT 6960

AAAGGTCAAG CGGTAATTAC GGAAATCGAA GGTGTCGTAG AAGATATTAA ATTAGCAAAA 7020 

GATAGACAAC AAGAAATTGT TGTTAAAGGT GCTAATGAAA CAAGATCATA CCTTGCTTCA 7080 

GGTACTTCAA GAATTATTGT AGAAATCGGT CAACCAGTTC AACGTGGTGA AGTATTAACT 7140

GAAGGXTCTA TTGAACCTAA GAATTACTTA TCTGTTGCTG GATTAAACGC GACTGAAAGC 7200 

TACTTATTAA AAGAAGTACA AAAAGTTTAC CGTATGCAAG GTGTAGAAAT CGACGATAAA 7260 

CACGTTGAGG TTATGGTTCG ACAAATGTTA CGTAAAGTTA GAATTATCGA AGCAGGTGAT 7320

ACX3AAGTTAT TACCAGGTTC ATTAGTTGAT ATTCATAACT TTACAGATGC AAATAGAGAA 7380

GCATTTAAAC ACCGTAAGCG TCCTGCAACA GCTAAACCAG TATTACTTGG TATTACTAAA 7440

GCATCACTTG AAACAGAAAG TTTCTTATCT GCAGCATCAT TCCAAGAAAC AACAAGAGTT 7500 

CTTACAGATG CAGCAATTAA AGGTAAGCGT GATGACTTAT TAGGTCTTAA AGAAAACGTA 7560

ATTATTGGTA AGTTAATTCC AGCTGGTACT GGTATGAGAC GTTATAGCGA CGTAAAATAC 7620

GAAAAAACAG CTAAACCAGT TGCAGAAGTT GAATCTCAAA CTGAAGTAAC GGAATAACAA 7680

ATGTTGACGA ATTCTCTTGT TCAATGTTAA TATATTAAAG GTTGATGCAA GCAGAACTTT 7800

GGAGGATAAA TTATTGTCTA AGGAAAAAGT tGCACGCTTT AACAAACAAC ATTTTGTAGT 7860

TGGTCTTAAA GAAACGCTTA AAGCGTTAAA GAAAGATCAA GTTACATCTT TGATTATTGC 7920

TGAAGACGTT GAAGTATATT TAATGACTCG CGTGTTAAGC CAAATCAATC AGAAAAATAT 7980 

ACCTGTATCT TTTTTCAAAA GCAAACATGC TTTGGGTAAA CATGTAGGTA TTAACrGTCAA 8040

TGCGACAATA GTAGCATTGA TTAAATGAGA ATTAGTAAGT GTTTTACTTA CTAAATTTTA 8100 

TTTAACCTAA AAATGAACCA CCTGGATGTG TGGGATTAAA AAGTGAAGAG AGGAGGACAT 8160 

ATCACATGCC AACTATTAAC CAATTAGTAC GTAAACCAAG ACAAAGCAAA ATCAAAAAAT 8220 

CAGATTCTCC AGCTTTAAAT AAAGGTTTCA ACAGTAAAAA GAAAAAATTT ACTGACTTAA 8280 

ACTCACCACA AAAACGTGGT GTATGTACTC GTGTAGGTAC AATGACACCT AAAAAACCTA 8340 

ACTCAGCGTT ACGTAAATAT GCACGTGTGc gTtTATCAAA CAACATCGAA ATTAACGCAT 8400 

ACATCCCTGG TATCGGACAT AACTTACAAG AACACAGTGT TGTACTTGTA CGTGGTGGAC 8460

GTGTAAAAGA CTTACCAGGT GTGCGTTACC ATATTGTACG TGGAGCACTT GATACTTCAG 8520 

GTGTTGACGG ACGTAGACAA GGTCGTTCAT TATACGGAAC TAAGAAACCT AAAAACTAAG 8580

AATTTAGTTT TTAATTAAAT CTTAAACTTA AAATATTTAA TATAAGGAAG GGAGGATTTA 8640 

CATTATGCCT CGTAAAGGAT CAGTACCTAA AAGAGACGTA TTACCAGATC CAATTCATAA 8700 

CTCTAAGTTA GTAACTAAAT TAATTAACAA AATTATGTTA GATGGTAAAC GTGGAACAGC 8760 

ACAAAGAATT CTTTATTCAG CATTCGACCT AGTTGAACAA CGCAGgtTCG TGATGCATTA 8820 

GAAGTATTCG AAGAAGCAAT CAACAACATT ATGCCAGTAT TAGAAGTTAA AGCTCGTCGC 8880

GTAGGTGGTT CTAACTATCA AGTACCAGTA GAAGTTCGTC CAGAGCGTCG TACTACTTTA 8940 

GGTTTACGTT GGTTAGTTAA CTATGCACX3T CTTCGTGGTG AAAAAACGAT GGAAGATCGT 9000 

TTAGCTAACG AAATTTTAGA TGCAGCAAAT AATACAGGTG GTGCCGTTAA GAAACGTGAG 9060 

GACACTCACA AAATGGCTGA AGCAAACAAA GCATTTGCTC ACTACCGTTG GTAAGATAAA 9120 

AGCTTTTACC CTGAGTGTGT TCTATATTAA TGAATTTTCA TTAAGCGTTC ATGCTTAGGG 9180 

CATCGCCATA TCTATCGTAT TTATTCAGTA ATATAAACTG GAAGGAGAAA AAATACATGG 9240

CTAGAGAATT TTCATTAGAA AAAACTCGTA ATATCGGTAT CATGGCTCAC ATTGATGCTG 9300

GTAAAACGAC TACGACTGAA CGTATTCTTT ATTACACTGG CCGTATCCAC AArGknGGTG 9360

AAaCACACGA AGGTGCTTCA CAAATGGACT GGATGGAGCA AGAACAAGAC CGTGGTATTA 9420

CTATCACATC TGCTGCAACA ACAGCAGCTT GGGAAGGTCA CCGTGTAAAC ATTATCX5ATA 9480

CAGTTACAGT ACTTGATGCA CAATCAGGTG TTGAACCTCA AACTGAAACA GTTTGGCGTC 9600

AC5GCTACAAC TTATGGTGTT CCACGTATCG TATTTGTAAA CAAAATGGAC AAATTAGGTG 9660

CTAACTTCGA ATACTCTGTA AGTACATTAC ATGATCGTTT ACAAgCTAAC GCTGCTCCAA 9720

TCCAATTACC AATTGGTGCG GAAGACGAAT TCGAAGCAAT CATTGACTTA GTTGAAATGA 9780

AATGTTTCAA ATATACAAAT GATTTAGGTA CTGAAATTGA AGAAATTGAA ATTCCTGAAG 9840

ACCACTTAGA TAGAGCTGAA GAAGCTCGTG CTAGCTTAAT CGAAGCAGTT GCAGAAACTA 9900

GCGACGAATT AATGGAAAAA TATCTTGGTG ACGAAGAAAT TTCAGTTTCT GAATTAAAAG 9960 

AAGCTATCCG CCAAGCTaCt AcTAACGTAG AATTCTACCC AGTACTTTGT GGTACAGCTT 10020 

TCAAAAACAA AGGTGTTCAA TTAATGCTTG ACGCTGTAAT TGATTACTTA CCTTCACCAC 10080 

TAGACGTTAA ACCAATTATT GGTCACCGTG CTAGCAACCC TGAAGAAGAA GTAATCGCGA 10140

AAGCAGACGA TTCAGCTGAA TTCGCTGCAT TAGCGTTCAA AGTTATGACT GACCCTTATG 10200

TTGGTAAATT AACATTCTTC CGTGTGTATT CAGGTACAAT GACATCTGGT TCATACGTTA 10260

AGAACTCTAC TAAAGGTAAA CGTGAACGTG TAGGTCGTTT ATTACAAATG CACGCTAACT 10320

CACGTCAAGA AATCGATACT GTATACTCTG GAGATATCGC TGCTGCGGTA GGTCTTAAAG 10380

ATACAGGTAC TGGTGATACT TTATGTGGTG AGAAAAATGA CATTATCTTG GAATCAATGG 10440 

AATTCCCAGA GCCAGTTATT CACTTATCAG TAGAGCCAAA ATCTAAAGCT GACCAAGATA 10500 

AAATGACTCA AGCTTTAGTT AAATTACAAG AAGAAGACCC AACATTCCAT GCACACACTG 10560

ACGAAGAAAC TGGACAAGTT ATCATCGGTG GTATGGGTGA GCTTCACTTA GACATCTTAG 10620

TAGACCGTAT GAAGAAAGAA TTCAACGTTG AATGTAACGT AGGTGCTCCA ATGGTTTCAT 10680

ATCGTGAAAC ATTCAAATCA TCTGCACAAG TTCAAGGTAA ATTCTCTCGT CAATCTGGTG 10740

GTCGTGGTCA ATACGGTGAT GTTCACATTG AATTCACACC AAACGAAACA GGCGCAGGTT 10800 

TCGAATTCGA AAACGCTATC GTTGGTGGTG TAGTTCCTCG TGAATACATT CCATCAGTAG 10860 

AAGCTGGTCT TAAAGATGCT ATGGAAAATG GTGTTTTAGC AGGTTATCCT TTAATTGATG 10920 

TTAAAGCTAA ATTATATGAT GGTTCATACC ATGATGTCGA TTCATCTGAA ATGGCCTTCA 10980

AAATTGCTGC ATCATTAGCA CTTAAAGAAG CTGCTAAAAA ATGTGATCCT GTAATCTTAG 11040

AACCAATGAT GAAAGTAACT ATTGAAATGC CTGAAGAGTA CATGGGTGAT ATCATGGGTG 11100

ACGTAACATC TCGTCGTGGA CGTGTTGATG GTATGGAACC TCGTGGTAAT GCACAAGTTG 11160

TTAATGCTTA TGTACCACTT TCAGAAATGT TCGGTTATGC AACATCATTA CGTTCAAACA 11220

CTCAAGGTCG CGGTACTTAC ACTATGTACT TCGATCACtA TGCTGAAGTT CCaiAAATCaA 11280

GCCTAGGTTA AAATACAAGG TGAGCTTAAA TGTAAGCTAT CATCTTTATA GTTTGATTTT 11400

TTGGGGTGAA TGCATTATAA AAGAATTGTA AAATTCTTTT TGCATCGCTA TAAATAATTT 11460

CTCATGATGG TGAGAAACTA TCATGAGAGA TAAATTTAAA TATTATTTTT AATTAGAATA 11520 

GGAGAGATTT TATAATGGCA AAAGAAAAAT TCGATCGTTC TAAAGAACAT GCCAATATCG 11580 

GTACTATCGG TCACGTTGAC CATGGTAAAA CAACATTAAC AGCAGCAATC GCTACTGTAT 11640 

TAGCAAAAAA TGGTGACTCA GTTGCACAAT CATATGACAT GATTGACAAC GCTCCAGAAG 11700 

AAAAAGAACG TGGTATCACA ATCAATACTT CTCACATTGA GTACCAAACT GACAAACGTC 11760 

ACTACGCTCA CGTTGACTGC CCAGGACACG CTGACTACGT TAAAAACATG ATCACTGGTG 11820

CTGCTCAAAT GGACGGCGGT ATCTTAGTAG TATCTGCTGC TGACGGTCCA ATGCCACAAA 11880

CTCGTGAACA CATTCTTTTA TCACGTAACG TTGGTGTACC AGCATTAGTA GTATTCTTAA 11940 

ACAAAGTTGA CATGGTTGAC GATGAAGAAT TATTAGAATT AGTAGAAATG GAAGTTCGTG 12000 

ACTTATTAAG CGAATATGAC TTCCCAGGTG ACGATGTACC TGTAATCGCT GGTTCAGCAT 12060 

TAAAAGCTTT AGAAGGCGAT GCTCAATACG AAGAAAAAAT CTTAGAATTA ATGGAAGCTG 12120 

TAGATACTTA CATTCCAACT CCAGAACGTG ATTCTGACAA ACCATTCATG ATGCCAGTTG 1218 0 

AGGACGTATT CTCAATCACT GGTCGTGGTA CTGTTGCTAC AGGCCGTGTT GAACGTGGTC 1224 0 

AAATCAAAGT TGGTGAAGAA GTTGAAATCA TCGGTTTACA TGACACATCT AAAACAACTG 12 3 00 

TTACAGGTGT TGAAATGTTC CGTAAATTAT TAGACTACGC TGAAGCTGGT GACAACATTG 12360

GTGCATTATT ACGTGGTGTT GCTCGTGAAG ACGTACAACG TGGTCAAGTA TTAGCTGCTC 12420 

CTGGTTCAAT TACACCACAT ACTGAATTCA AAGCAGAAGT ATACGTATTA TCAAAAGACG 12480 

AAGGTGGACG TCACACTCCA TTCTTCTCAA ACTATCGTCC ACAATTCTAT TTCCGTACTA 12540 

CTGAOGTAAC TGGTGTTGTT CACTTACCAG AAGGTACTGA AATGGTAATG CCTGGTGATA 12600 

ACGTTGAAAT GACAGTAGAA TTAATCGCTC CAATCGCGAT TGAAGACGGT ACTCGTTTCT 12660 

CAATCCGTGA AGGTGGACGT ACTGTAGGAT CAGGCGTTGT TACTGAAATC ATTAAATAAT 12720 

TTCTAATTTC TTAGATTTTA TATAAAAAGA AGATCCCTCA ATCGAGGGGt CTTTTTTTAA 12780 

TGTGTAAATT TTGTAATGGC TATTCGATTT AGAAGAACAA TAATTGATGA AAGACTGACT 12840 

AATAAAACTT ATAACTGATA ATACTGTTTA AATAAAATTG TTGAGTCTTG GACATTGTAA 12 900 

AATGCTCCCT TCAAAGTTTT CATTTTTTCa ATGTCTACTT TGAAGGGAGC ATTTCATTAG 12 960 

TTTATGTCTC AGATTCATAT CTTTCAATTA ATTTAAATGC TTAATTTGTT TTAAATACTT 13 020 

GCTCTAATTC TATGATTTTT AAAAATACAG CTACAGCGTA TTTTAATGAT TTTTCATCAA 13080 
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TCAGAAAGAA TGCACCTGGT CGTACTTTCA AATAATGTGA AAAATCTTCT CCAATCATCA 13 200 

TTAAATCTGA TTCATTAAAG CGTACATGTA AGTCATTTGT TGCTTCTTTA ATAACTTGAT 13260

ATGCTTTCTC GTTATTATGG ACAGGCAAAT ACCCTTTAAT ATAATTCAAA TCATAGTTAA 13320

TATCATTTGC TATTGCTAAA CCTTGTAGAA GCTTATCCAT TTTGTCCATT ACATGATTCT 13 3 80 

GTATATCTGA ATCGAAAGTT CTAACTGTAC CTTTACAAAA TGCTTGATCA GGAATAACGC 13440 

TATCTGTGGT GCCTGCTTGA ATCATTCCAA ATGAAAGTAC AGCTTGTTTA ACTGGATCGA 13 500 

TCGTACGTGA AATTATTTTT TGTGCACTTA AAATGAACTC TGCCATGATT ACTATTGGGT 13 560 

CAATGGTTTC ATGAGGTTTG GCACCATGAC CACCACGACC TTTAAATGTG ACGCTAAATT 13620 

CATCTGGAGA GGCCATGATT GCCCCCGCAC GTGAATGAAT AGTTCCAGTA GGATAACCAC 13 680 

TCCATAAATG TGTACCGTAA ATTCTATCTA CATTTTCCAG ACATCCAGCA TCTATCATTT 13 740 

CTTGAGAACC ACCTGGCATG ATTTCTTCAC CGTACTGGAA TATTAATACA ACATTACCTT 13 800 

CTAATAAATG TTTATGTTCA TCTAAAATCT CTGCTACAGT AAGTAAAATT GCTGTATGAC 13 860 

CATCATGCCC ACACGCATGC ATACATCCTG GATTTTTAGA CTTATAAGGC ACATCGTTTA 13920 

ATTCCTCGAC AGGTAACGCA TCAAAGTCAG CTCTTAATGC AATGGTAGGT CCTGTGCCCA 13980

AGCCTTTAAA TGTGGCTTTG ATACCATTGC GGCCGATAGG AGTTTCAATA TCACAAGATA 1404 0 

ACTGGCTTAA TTGGTTAACA ATATAATCAT GTGTTTGAAA TTCTTCAAAA GATAACTCAG 14100 

GATATTGGTG TAAATAACGT CTGAGTTGAA TTGTTTTATT TTCTTTATTA TTTGCTAGTT 14160

GGAACCAATC TAACACCCTT ATCACTACTT TCTAAAATAA TGTTTATAGT ATAACATTTT 14220 

ATGAAATTAT CGTACTAAAT GATTGCTTTG AGATATTTTA TCTATGAATG ATAAGGCTTT 14280 

CAAGTTATGT AGAATTACTG TATGATAAAG GTATTACCAA ACAATACTTA AGGGGGATTA 14340 

TAT;SrrGTGG TTCAATCATT ACATGAGTTT TTAGAGGAAA ATATAAATTA TCTAAAAGAA 14400 

AATGGTTTGT ATAATGAAAT AGATACAATT GAAGGTGCAA ACGGACCAGA AATCAAAATC 144 60 

AATGGGAAAT CATACATTAA CTTATCTTCA AATAATTATT TAGGACTAGC AACAAATGAA 14520 

GATTTGAAAT CaGctGCAAA AGCAGCTATT GATACACATG GTGTAGGTGC AGGCGCTGTT 14 5 80 

CGTACAATCA ATGGTACATT AGATTTACAC GACGAATTAG AAGAAACACT AGCAAAATTT 14 640 

AAAGGAACAG AAGCTGCAAT AGCTTATCAA TCAGGATTTA ATTGTAATAT GGCTGCTATT 14700 

TCAGCTGTCA TGAATAAAAA TGATGCTATT TTATCAGATG AGCTTAATCA TGCATCAATT 14760 

ATTGATGGAT GTCGCTTATC TAAAGCTAAA ATTATTCGAG TTAACCATTC AGACATGGAT 14820

GATTTACGTG CGAAAGCAAA AGAAGCAGTT GAATCAGGTC AATACAATAA AGTGATGTAT 148 80 
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ATTGCAGAAG AATTTGGTTT ATTAACTTAT GTTGACGACG CTCATGGTTC AGGTGTTATG 15000 

GGTAAAGGCG CT3GTACGGT TAAACATTTT GGTTTACAAG ATAAAATCGA TTTCCAAATA 15 060 

GGTACGCTTT CTAAAGCAAT TGGTGTCGTT GGCGGTTATG TAGCAGGTAC AAAAGAGTTA 1512 0 

ATAGATTGGT TAAAAGCACA ATCACGACCA TTCTTATTCT CTACATCATT AGCACCTGGG 15180 

GATACCAAAG CAATAACTGA AGCAGTTAAA AAGTTAATGG ATTCAACTGA ATTACATGAT 15240 

AAATTATGGA ACAATGCACA ATATTTAAAA AATGGATTGT CAAAATTAGG ATATGATACA 15300 

GGTGAGTCAG AAACTCCAAT TACACCAGTA ATTATTGGTG ATGAAAAAAC AACTCAAGAA 15360 

TTTAGTAAGC GTTTAAAAGA CGAAGGTGTC TATGTGAAAT CTATCGTTTT CCCAACAGTA 15420 

CCAAGAGGTA CAGGACGTGT AAGAAATATG CCTACAGCTG CACATACAAA AGACATGTTA 15480 

GATGAAGCAA TTGCGGCTTA TGAAAAAGTA GGAAAAGAAA TGAAGTTGAT TTAATATTTA 15540 

TTTATTCCCA CGGCAAATAT TGTCGTGGGC TTTTTTTAAT GTTTAGTTTA TTAACAGT 15598 
(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 661 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

AAGTAAATCA ACTTACTGGG ATAAGAATAA AGGCGATTAT AGTAACAAGT TGATTTTATT 60 

CGAAAAACAT TTTGAACCGG TTCTGGGTAT CAAGATGCAA CATAGTGGAG GTCATAGCTT 120 

TGGCCACACG ATTATTACGA TTGAAAGTCA AGGAGATAAA GCAGTTCATA TGGGTGATAT 180 

ATTCCCAACT ACTGCACATA AAAATCCTCT ATGGGTAACG GCATATGATG ATTATCCTAT 240 

GCAATCGATT CGTGAAAAAG AACGCATGAT ACCATATTTT ATTCAGCAAC AATATTGGTT 3 00 

CTTGTTTTAT CATGATGAAA ACTACTTTGC TGTAAAATAC AGCGATAATG GTGAAAACAT 360 

AGATGCATAT ATTTTACGTG AAACATTAGT TGATAATAAC TAAAATAAAG ATGTATTACT 4 20 

AAACAAATTT TCAAAAATAA AAAATTGAGC CACATCCAAT CTTACTAATT AGGGTGTGGC 4 80 

TCATTTTTAA GTTTTACgAT CCAAATCAAA TATGGaTAAA ATTCgTATTA ACGCTCTACa 54 0 

ATGtTAATGA CTTCACCAGT ATATGCATCT GCATAAAAAT CATAATGAAT ATTTTGACCA 600 

TTTTTAATAG TTGTAATTCC ACCTTGATAA ACTAAACGGT ATTTATCAGT TTCAGGATGA 660 

A 661 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5738 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:



GCAGACGGTA CAGCAGTTAA AGTCGCACCA AaACTGTAGT GAATcTAATC GGTGcATTCT 60

TTTTAGGATT agttgtcgcg CTTATATATA TCTTCTTCAA AGTAATTTTC GATAAGCGAA 120

TTAAAGATGA AGAAGATGTA GAGAAAGAAT TAGGATTGCC TGTATTGGGT TCAATTCAAA 180

AATTTAATTA aggatggttg CTACTTATGT CAAAAAAGGA AAATACGACA ACAACACTAT 240

TTGTATATGA AAAACCAAAA TCAACAATTA GTGAAAAGTT TCGAGGTATA CGTTCAAACA 300

TCATGTTTTC aaaagcaaat GGTGAAGTAA AGCGCTTATT GGTTACTTCT GAAAAGCCTG 360

GTGCAGGTAA aagtacagtt GTATCGAATG TAGCGATTAC TTATGCACAA GCAGGCTATA 420

AGACATTAGT TATTGATGGC GATATGCGTA AgcCAACACA AAACTATATT TTTAATGAGC 480

AAAATAATAA TGGACTATCA AGCTTAATCA TTGGTCGAAC GACTATGTCA GAAGCAATTA 540

CGTCGACAGA AATTGAAAAT TTAGATTTGC TAACAGCTGG CCCTGTACCT CCAAATCCAT 600

CTGAGTTAAT TGGGTCTGAA AGGTTCAAAG AATTAGTTGA TCTGTTTAAT AAACGTTACG 660

ACATTATTAT TGTCGATACA CCGCCAGTTA ATACTGTGAC TGATGCACAA CTATATGCGC 720

GTGCTATTAA AGATAGTCTG TTAGTAATTG ATAGTGAAAA AAATGATAAr AATGAAGTTA 780

AAAAAGCAAA AGCACTTATG GAAAAAGCAG GCAGTAACAT TCTAGGTGTC ATTTTGAACA 840

AGACAAAGGT CGATAAATCT TCTAGTTATT ATCACTATTA TGGAGATGAA TAAGTATGAT 900

TGATATTCAT AACCATATAT TGCCTAATAT CGATGACGGT CCGACAAATG AAACAGAGAT 960

GATGGATCTT TTAAAACAAG CGACAACACA AGGTGTTACA GAAATCATTG TAACATCACA 1020

TCACTTACAT CCTCGATATA CCACACCTAT AGAAAAAGTG AAATCATGTT TAAACCATAT 1080

TGAAAGCTTA GAGGAAGTAC AAGCACTAAA TCTAAAGTTT TATTATGGTC AGGAAATAAG 1140

AATTACCGAT CAAATCCTTA ATGATATTGA TCGAAAAGTT ATTAACGGTA TTAATGATTC 1200

ACGCTATTTA CTAATAGAAT TTCCATCAAA TGAAGTTCCA CACTATACTG ATCAATTATt 1260

TTTCGAATtA CAGAGTAAAG GCTTTGTACC GATTATTGCA CATCCAGAGC GGAATAAAGC 1320

AATAAGTCAA AACCTTGACA TACTATACGA TTTAATTAAC AAAGGTGCTT TAAGTCAAGT 1380

GACAACGGcG TCATTAGCGG GTATTTCCGG TAAAAAAATT AGAAAATTAG CAATTCAAAT 1440
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GTTCTTAATG AAAGACTTAT TTAATGATAA GAAATTACGT GATTATTATG AAGATATGAA 1560 

CGGATTTATT AGTAATGCGA AGTTAGTTGT TGATGATAAA AAAATTCCTA AACGAATGCC 1620 

ACAACAAGAT TATAAACAGA AAAGATGGTT TGGGTTATAA ACAGCAAATG AGGGGTTTTA 1680

TGGCACATTT ATCTGTGAAA TTGCGGCTTT TAATACTAGC ATTAATCGAT TCACTGATAG 174 0 

TGACATTTTC AGTATTCGTA AGTTATTACA TTTTAGAACC GTATTTCAAA ACATATTCTG 1800 

TCAAATTATT AATATTGGCA GCTATATCAC TATTCATATC GCATCATATT TCaGCATTTA 1860 

TTTTTAATAT GTATCATCGA GCGTGGGAAT ATGCCAGTGT GAGTGAATTG ATTTTAATTG 1920 

TTAAAGCTGT GACGACATCT ATCGTTATTA CGATGGTGGT CGTGACAATT GTTACAGGCA 19 80 

ATAGACCGTT TTTTAGATTG TATTTAATTA CTTGGATGAT GCACTTGATT TTAATAGGTG 204 0 

GCTCAAGGTT ATTTTGGCGT ATTTATCGGA AATACCTTGG AGGTAAGTCA TTTAATAAGA 2100 

AGCCAACTTT AGTTGTTGGT GCTGGTCAAG CAGGTTCAAT GCTGATTAGA CAAATGTTGA 2160

AAAGTGACGA AATGAAACTT GAACCGGTAT TAGCAGTCGA TGATGACGAA CATAAACGCA 2220 

ATATCACAAT TACTGAGGGT GTAAAAGTCC AAGGTAAAAT TGCGGATATT CCAGAACTAG 2280 

TGAGGAAATA TAAGATTAAA AAAATCATCA TTGCAATTCC AACTATTGGT CAAGAGCGTT 2340

TGAAAGAAAT TAATAATATT TGCCATATGG ATGGCGTTGA GTTATTGAAA ATGCCAAATA 24 00 

TAGAAGACGT CATGTCTGGT GAGTTAGAAG TGAACCAACT TAAAAAAGTT GAAGTAGAAG 24 60 

ATTTACTAGG CAGAGATCCT GTTGAATTAG ATATGGATAT GATATCAAAT GAATTGACGA 2520

ATAAAACTAT TTTAGTTACG GGTGCAGGTG GTTCAATAGG ATCAGAAATT TGTAGACAAG 2580 

TTTGTAATTT CTATCCAGAA CGTATTATTC TACTTGGCCA TGGTGAAAAC AGTATTTATT 264 0 

TAATCAATCG TGAATTGCGA AATCGCTTCG GwAAAAATGT TGATATCGTT CCTATTATAG 2700 

CGGATGTGCA AAATAGAGCG CGTATGTTTG AAATTATGGA AACGTATAAA CCATACGCAG 2760

TTTATCATGC AGCAGCACAC AAGCACGTGC CGTTAATGGA AGACAACCCT GAAGAAGCAG 2820 

TACGTAATAA TATTTTAGGT ACGAAAAATA CTGCTGAAGC TGCTAAAAAT GCAGAGGTAA 2880

AGAAATTCX3T TATGATTTCT ACGGATAAAG CCGTTAATCC GCCTAATGTC ATGGGAGCTT 2940 

CAAAGCGAAT TGCAGAAATG ATTATTCAAA GTTTAAATGA TGAAACGCAT CGAACAAATT 3000 

TTGTTGCAGT GAGATTTGGT AATGTACTTG GATCGAGAGG ATCTGTGATT CCACTTTTCA 3060 

AAAGTCAAAT TGAAGAAGGT GGGCCAGTTA CTGTGACACA TCCTGAAATG ACACGTTACT 312 0 

TTATGACAAT TCCTGAAGCT TCTAGACTAG TTTTGCAGGC AGGGGCATTA GCAGAAGGTG 3180

GCGAAGTATT TGTGCTAGAT ATGGGAGAAC CAGTGAAAAT TGTAGATTTG GCACGTAATT 324 0 
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CCGGCGAAAA AATGTTTGAA GAGCTTATGA ATAAAGATGA GGTTCATCCT GAACAAGTAT 3360 

TTGAAAAAAT TTATCGTGGC AAAGTACAAC ATATGAAATG TAATGAAGTT GAAGCGATTA 3420 

TTCAAGACAT CGTCAATGAC TTTAGTAAAG AAAAAATTAT TAACTATGCC AATGGCAAAA 3480

AGGGAGATAA TTATGTTCGA TGACAAAATT TTATTAATTA CTGGGGGCAC AGGATCATTC 3 54 0 

GGTAATGCTG TTATGAAACA GTTTTTAGAT TCTAATATTA AAGAAATTCG TATTTTTTCA 3600

CGCGATGAGA AAAAACAAGA TGACATTCGA AAAAAATATA ATAATTCAAA ATTAAAGTTC 3 660 

TACATTGGTG ATGTGCGTGA TAGTCAAAGT GTAGAAACAG CAATGCGAGA TGTTGATTAC 3720 

GTATTCCATG CAGCAGCTTT AAAACAAGTG CCGTCATGTG AATTCTTTCC AGTTGAGGCA 378 0 

GTGAAGACAA ATATTATTGG TACAGAAAAT GTCTTACAAA GTGCTATTCA TCAAAATGTT 384 0 

AAAAAAGTCA TATGTTTATC TACAGATAAG GCAGCGTATC CTATTAATGC TAGGGGTATT 3900 

TCAAAAGCAA TGATGGAAAA AGTATTCGTA GCCAAATCAA GAAATATTCG TAGTGAACAA 3960

ACGCTTATTT GTGGTACAAG ATACGGTAAT GTGATGGCTT CAAGAGGATC AGTAATACCT 4020 

TTGTTTATCG ACAAAATCAA AGCTGGAGAA CCTTTAACGA TTACAGATCC TGATATGACA 408 0 

AGATTTTTAA TGAGCTTAGA AGATGCGGTA GAACTAGTTG TTCATGCATT TAAGCATGCA 4140

GAGACAGGAG ATATTATGGT TCAAAAAGCA CCAAGCTCAA CGGTAGGGGA TCTTGCGACC 4 2 00 

GCATTATTAG AATTGTTTGA AGCTGATAAT GCAATTGAAA TCATTGGTAC GCGACATGGA 4 260 

GAGAAAAAAG CAGAAACATT GTTGACGAGA GAAGAATACG CACAATGTGA AGATATGGGT 4320

GATTATTTTA GAGTGCCGGC AGACTCCAGA GATTTAAATT ATAGTAATTA TGTTGAAACC 4380 

GGTAACGAAA AGATTACGCA ATCTTATGAA TATAACTCCG ATAATACACA TATTTTAACG 4440 

GTGGAAGAGA TAAAAGAAAA ACTTTTAACA CTAGAATATG TTAGAAACGA ATTGAATGAT 4 500 

TATAAAGCTT CAATGAGATA GGAGAGATTG ACGTTGAATA TTGTAATTAC AGGAGCAAAA 4560 

GGTTTTGTAG GAAAAAACTT GAAAGCAGAT TTAACTTCAA CGACAGATCA TCATATTTTC 4 620 

GAAGTACATC GACAAACTAA AGAGGAAGAA TTAGAGTCAG CATTGTTGAA AGCAGACTTT 4 680 

GTCGTGCATT TAGCGGGTGT TAATCGACCT GAACATGACA AAGAATTCAG CTTAGGAAAC 4 740 

GTGAGTTATT TAGATCATGT ACTTGATATA TTAACTAGAA ATACGAAAAA GCCAGCGATA 4800

TTATTATCGT CTTCAATACA AGCAACACAA GATAATCCTT ATGGTGAGAG TAAGTTGCAA 4860 

GGGGAACAGC TATTAAGAGA GTATGCCGAA GAGTATGGCA ATACGGTTTA TATTTATCGC 4 92 0 

TGGCCAAATT TATTCGGCAA GTGGTGTAAG CCGAATTATA ACTCAGTGAT AGCAACATTT 4980

TGTTACAAAA TTGCACGTAA CGAAGAGATT CAAGTTAATG ATCGGAATGT TGAACTAACG 504 0 
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ATTGAAAATG GTGTACCTAC AGTACCAAAC GTATTTAAAG TGACATTGGG AGAAATTGTA 516 0 

GATTTATTAT ACAAGTTCAA ACAGTCACGT CTCGATCGAA CATTGCCGAA ATTAGATAAC 5220 

TTGTTTGAAA AAGATTTGTA TAGTACGTAT TTAAGCTATC TACCTAGTAC aOACTTTAGT 5280

TAyCCCTTAC TTATGAATGT GGATGATAGG GGTTCTTTTA CAGAATTTAT AAAAACACCG 534 0 

GATCGTGGTC AAGTTTCTGT AAATATTTCT AAACCAGGTA TTACTAAAGG TAATCACTGG 5400 

CATCATACTA AAAACGAAAA ATTTCTAGTC GTATCAGGTA AAGGGGTAAT TCGTTTTAGA 5460 

CATGTTAATG ATGATGAAAT CATTGAATAT TATGTTTCTG GCGACAAATT AGAAGTTGTA 5520 

GACATACCAG TAGGATACAC ACATAATATT GAAAATTTAG GCGACACAGA TATGGTAACT 5580 

ATTATGTGGG TGAATGAAAT GTTTGATCCA AATCAGCCAG ATACGTATTT CTTGGAGGTA 5640 

TAGCGCATGG aAAAACTGAA rTTAATGACA ATAGTTGGTA CAAGGCCTGA AATCATTCGT 5700 

TTATCATCAA CGATTAAAGC ATGTGATCAA TATtTTAA 5738
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9062 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

ATCATCAACA AGAATGATAT TTTTCCCATC TACTATATCT TTTACCGCAG ATAACTTCAC 60 

TCTCACACCT TGCTCACGTA ATTCTTGAGT TGGTTGAATA AATGTTCTTG CAACATATTG 120 

ATTTTTAACT AGTCCCATTT CATATGGCAA ACCTATTTCT TCAGCATAAC CACTCGCAGC 180 

TGA-I^GCCSAT gAATTGGGTA CACCGATGAC CATATCAGCA TTTACAGGGC TTTCTTGGGC 240 

TAATTTTTTA CCAGAAGCTT TACGTACTGC ATGGACATTT TTACCAGCTA TTGTTGAGTC 300 

TGGTCTAGCA AAATAAATAT ATTCCATCGC AGAAATTGCA GTTGTCGTAT GATGTGTATA 360 

AGATTTAACT GTAATACCTT TATCGTTAAT CACGACATAT TCACCTGCAT GAATATCTTG 420 

AACAAATTCT GCACCTAACA CATCTATTGC ACATGTTTCA CTTGCAAGGA TGTATGTCCC 480

ATCTTTCATT TTACCTACAA CAAGTGGTCT GATAGCATTT GGATCTACTG CGCCATATAA 54 0 

CGCATCTTTA GTTAAAATCG CAAATGTAAA ACCGCCTTTA ACTTTTCGCA AACTTTCTTT 600 

CAACGCTTCC TCAAAAGTAG GAGCTTTACT TCGACGTATC AAATGCATAA TGACTTCAGT 660

ATCAGAAGAC GAATGGAAGA TAGCACCTTG TTTTTCTAAA TTCTGACGCA ATGATTTAGC 720 




CGGTTGAATA TTTTCAATAC CTTTATTACC TGAAGTAGCA TAACGGACGT GACCAATTGC 840 

ATGTTGATAT CCTTTTAATC GTTCCATTTG ATCATCTTTA ATCGCTTCAG TTAGTAAGCC 90 0 

TAATCCTCGC TCGCCTTTTA ATTCATTTTG ATCAGAAACA ACTATACCTG cACCTTCTTG 960

ACCACGATGT TGCAAACTAT GAAGTCCCAT ATAtGTTAGT TGCGCTGCtT CaGGATGATT 1020 

CCAAATACCA AACACGCCAC ATTCTTCGTT TAATCCTGAG TAGTTAAACA TTGaGCAATT 1080 

GCCCCtTCCC ATATTTGTTT AATATCTGAA ACATTTTCAC TAATCTCTGT aTATGGTGTT 1140 

GTTACCTTGr aATTATCACT ATCTGTTAAA AGTCCAATTT CTATTGCATT ATCAATATTT 1200 

AAAGTTTTAC CTGATTTAAC AGAAACAACA TATCXK3CCTT GCGTCTCACT AAACAATTGT 1260 

GCATTTGTTA TATCTATTGA AGATTTTAAT CCTAAACCGT AATGCGCACT TAGTTTAGCT 1320 

AAGGTAATCA GTAAGCCACC TTTACCAACT GTTTGAACAT GTGATAATAG TCCTTCACGA 1380 

ATAGCGGTCT TGATTGATTC ACCTTTTTCA ACTTCTGAAC TCAAATCTAA TGACTCAAAT 1440

TCATGATTAA CTTTGCCATA AATTAACTTT TCAAGTTGAC TACCACCAAA GTCGTCCTTA 1500 

GTATCACCGA TTAAATATAA TTTATCTCCA ACTTGAGGTT CAAAATCATT TAAATAATTT 1560 

ACATTTTCAA TCAAACCTAC CATTCCAACA ACTGGTGTTG GGAAAATAGA AGTACCTTTC 1620

GTTTCGTTAT ATAAAGATAC ATTACCAGAA ACTACTGGTG TCTTAAGAAT GTCGCATGCT 1680 

TCTGCCATAC CTTTCGTTGA ATCTATCAAC TGTTGATAGA TTTCTTTCTT TTCAGGAGAA 174 0 

CCATAATTTA AACAATCTGT CATTGCTAAT GGTGTTGCAC CCACGGCAAT TAAATTTCGA 1800

TAAGCTTCAG CTACTACCAT CTTTCCACCT TCATATGGAT TGTTATATAC ATAACGCGCT 1860 

TCACCATCAA TTGTTGAAGC AATTGCCTTA TTTGTGCCTT CCACACGTAC TACCGATGCT 1920 

TGAAGTCCTG GCTTAATTAT CGTATTGGCA CCAACTTGTT GGTCGTATTG ATCATATAAA 1980 

TAGTGTTTAG ATGCTATAGT CGGATGCTTA AGTAATTTAA AGAAAGTATC TTTAACATCG 2040 

ATGTGTGTAT AATCATTTTT AGAAGTATTA TAATCTTTTT CTTCTCCTTC TAAAATATAT 2100 

ACAGGTGCTT CATCAGCTAG TGGTTCAACT GGAATGTCAG CATAAACTTC GTCATCATAT 2160 

GTTAAAACAA AACGATTTGT ATCTGTAACT TCACCTATAA CAGCACTATC CAATTCGTGC 2220 

TTATCAAATA AATCTAAGAA TTTTTGTTCA GTACCTTTTT CAACAACTAG TAACATACGT 2280

TCTTGAGTTT CTGAAAGCAT CATTTCATAA GGAGAAATAC CTGGCTCACG TGTTGGCACT 2340 

TGTTCTAATC TCAAATGTAA CCCACTACCA CCTTTTGCCG CCATTTCAGA CGATGAAGAT 24 0 0 

GTTAAACCAG CAGCACCCAT ATCTTGAATA CCAACTAATT CATCAAATGT AATTGCTTCA 2460

AGTGTTGCTT CCATTAATTT TTTACCTACA AATGGATCAC CGATTTGTAC AGAAGGTCGT 2520 
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CGACCAGTTT TCAAACCAAC ATAAATGACC GAATTACCTA CACCTTTTGC TGTGCCTTTT 264 0 

TGAATCATGT CGTGATTGaT AACACCAACA CACATTGCAT TAACAAGTGG ATTGCCATCA 2700 

TAACGTTCAT CAAATTCGAT TTCACCAGCA GTTGTTGGa.a. TACCAATGCA GTTACCATAA 27 60 

CCTCCGATAC CCTTTACAAC ACCTTTAAGT AATCTTTGGT TTTGTTTATT ATCTAATTCT 2 8 20 

CCAAATCTAA GACTGTTTAA CAAATTAATA GGTCTAGCCC CAATAGAGAC AATGTCACGA 2 88 0 

ATGATTCCAC CAACGCCTGT AGCAGCCCCT TGATATGGTT CAATTGCTGA TGGATGATTG 2 94 0 

TGAGACTCTA CTTTAAATAC TACGGCTTGA TTATCACCTA TATCGACTAC CCCTGCACCT 3000 

TCACCAGGCC CCATAAGCAC ATGGTcACCT GACGTAGGAA ATTGCTTTAA AAACGGTTTA 3060 

GAATGTTTAT AAGAGCAATG TTCACTCCAC ATAACAGAAA AGATACCTGT TTCTGTAAAG 3120 

TTAGGTTGTC TGCCTAAAAT ATCGCAAACT TTTTCATATT CTTGATCaCT TAATCCCATA 3180 

TCTTGATATA CTTTTTCAAG TTTAATTTCT TCAACGCTTG GTTCGATAAA TTTAGACATG 3240 

TTGTTCCCTC CAACTTTTTA CCATCGCTTC AAATAATTTC ACACCACTAT CAGTACCTAA 3300 

CAACGTTTCT AAAGCTCTTT CagGATGtGG CATCATGCCA CATACATTGC CTTTTTCGTT 3360 

AACAATTCCT GCAATATCAT CATATGAACC GTTCGGATTA TTCACATATT TCAGAATAAT 3420 

TTGATTGTTA GCTTTTAATT GTTGATATAT TTCATCAGTA CAATAATAAT GACCTTCACC 3480 

GTGAGCTACA GGATATATAA CTTTTTCACC TTGTTCATAA AGATTTGTAA ATGCCGTTTG 3 54 0 

ATTATTCACT ATTTCTAACT CTTCATTTCT ACTAATAAAT AAATGTGAAT CGTTATGCAA 3 6 00 

TAATGCACCA GGTAATAAGC CTATTTCAGT TAAAATTTGA AACCCATTAC AAACACCTAA 3 6 60 

TACTGGCTTA CCTTCAGCTG CAAGACGTTT AACTTCCGAA ATAATCGGsG CTACACTAGC 3720 

CATTGCCCCA GATCTTAAGT AATCCCCGAA TGAAAATCCA CCAGGAATAA GTACGCCATC 3780 

AAATGCACTT AGTGATGTTT CTCTATAATC TACATATTCC GCTTCAACAC CACTTTTAAT 3840 

AGCAGCATTA AACATGTCTC TATCACAATT CGAACCTGGA AAAACAAGAA CCGCAAATTT 3900 

CATTTTATGC ATTCTCCTTT TCATCATCTA ACACTTTATA GCTATATTCT TCAATCACTG 3960 

TATTTGCAAA CAATTTTTCA CTTAGAGTTG TAATAATGTT GTGTACCTTT TCATCACTAA 4020 

CCTCATCCAC TGTCATATAT AATACTTTTC CTACACGAAT ATCATTCACT TGTGCATAAC 4080 

CTAAGTCATG TACAGCTCGA GTAAGCGTTT GTCCTTGCGT ATCTAATACT TGTGGTTGTA 4140 

ATGTGATATG TAGTTCAATT GTTTTCATTA TTTTAAATCC TCCAATTTGT TTAAAAATAT 4200 

TTGATATGTT TCAATCAGTG ATCCAGTGTT ATTTCTATAT ACATCTTTAT CAAAGTTTGC 4260 

ATTGGTAGCT TTATCCCAAA TTCGACATGT ATCTGGAGAT ATTTCATCCG CTAACAAAAT 4320 
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ATCCATTAAT TGTTTCAACA 


CATTATTAAT CTTTAATGCT TTGGATTTTA GTATTTCAAT 4440

ATCTTCATCT GATGCTATAT TGAGCAATTT AACATGGTCA TCCGTTATCA ACGGATCATT 4500

TAACGCATCA TTTTTATAGA AAAATTCTAC AAGTGGTTCT CTAAAAACTT CACCATTTTC 4560

AAAACCTAAA CGCTTTGTAA TAGATCCACT AGCAATATTA CGAACAACTA CTTCTAATGG 4620

AATTATTTTC ACAGGCTTAA CTAATTGTTC TGTTTCAGAT AATTGTTTAA TAAAGTGACT 4680

TTCTATTCCA TTTTCTTGTA AATATTTAAA TATAATAGAA GTAATTTGAT TATTTAATCG 4740

CCCCTTACCT GCCATTGTGT CTTTCTTAGC CCCGTTTCCA GCAGTAACTT CATCTTTATA 4800

TTCAACTCTT AATTCATTTT CTTGATTTGT TGAGAAAATG CGcTTCGCTT TTCCTTCATA 4860

TAATAATGTC ATGCTTTAAT TACTCCCCTC AAATTTAGCG TACATATCTT GTTCAGTTTG 4920

GTTTACATCA TTCGTTAGTA CAGTCATATG CCCCATTTTT CTGCTATCTT TACGCTCAGA 4980

CTTACCATAA ATATGTAAGT GCCACTCTGG ATGTTCATTA AATTCATTTT CCAATAAATC 5040

TAAATCTTTA CCTAGTAAGT TCATCATGAC TGCTGGCTTT AATAATTCAA TTGAATTTGG 5100

TAATGATTGT CCGGTAACTG CTAAAATATG AGTATCAAAT TGTGAATAAT CACATGCTTC 5160

AATTGAATAA TGTCCGGAAT TGTGAGGCCT TGGTGCTATC TCGTTCACAT ACAATTGGTT 5220

GTTACTATCT ATAAAAAATT CAACTGTAAA TGTTCCAATG AAATGAATCG ATTGGATAAT 5280

TTTATTAACT TGCTCTTTCG CCTCAGCTGT TTTATCTATT CTCGCTGGAA CAATTGTTTT 5340

GAAAAGTATT TGATTTCTAT GCTCATTTTC TTGTAATGGG AAAAAAGTGA TTTGATTGTT 5400

GTTTCCTCTT GTAACAGTAA GAGATACTTC TTTCTTGATA TTCAAATATT TTTCAGCTAC 5460

GCATTCACTA GTTTCAATTA ATTTAAAACC TTCTTGTAAG TCTTTTTCGT TGTTAATTAA 5520

AACTTGACCT TTGCCATCGT AGCCACCAAA TCTAGTTTTT ACAATAAAAG GATATCCTAA 5580

TGnffCAATT GCTTTGTCAA TATCTGTAGA TTCTTTTACT GAAATGAACG GGACAACTTT 5640

GGTACCAGCA CTTTTTAATG TTTCTTTTTC AGTTAAGCGA TCTTGTAATA ACTGTATAGC 5700

TTGGTAACCT TGCGGAATAT TGTACTTTTC ACATAATAGT TTTAATTGTT GGGCTGAAAT 5760

GTTTTCAAAT TCATAAGTAA TCACATCACA TTTTTGTCCT AATTGATTGA GTGCCTTTTC 5820

ATCGTCATAC TTGGCTTGTA TAAATTCGTG TGCAACGTAT CTACATGGAC AATCTTCAGA 5880

AGGATCCAAT ACAACCACTT TATAACCCAT TTTTTGAGCT GATTGTGCCA TCATCTTTCC 5940

AAGCTGACCA CCACCAATAA TGCCAATAGT CGCACCAAAC TTTAATTTAT TGAAGTTCAT 6000

TTTGCATGTC CTCCACTTTT TGAATTAACG AAGATTCATA CTGATTTAGT TTTTCAACTA 6060

AAGAAGGATT TTGAATACTT AACATTCTTG CTGCAAGTAT ACCTGCGTTT TTAGCACCTG 6120
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AAGPATCTAT ACCCTTTAAA 


CTTTTTGTTT CAATCGGCAC TCCAATAACT GGTAGCGTCG 6240

TTAATGATGC AACCATACCT GGTAAATGTG CCGCACCGCC AGCGCCTGCA ATGATAATGT 6300

TTATACCTCT TTCTCTCGCT TCAGAAGCAA ATTGAACCAT CATTTTTGGC GTACGATGTG 6360

CGGATACTAC TTGTTTTTCG TACGGAATTT CAAAATAATC CAACATGTTA CAACTCTCTT 6420

GCATAATTTT CCAATCGGAA GAACTGCCCA TAATGACTGC TACTTTCACT TTGTACACCC 6480

TTTCAAAAGT TTGAATTGTG AATTACTTTA GTTGTATATT ATAGATATAG CATAACAAGC 6540

AATTTCTGCT TTTTCAATCA AAAATCGAAC TTTATTTTGA TTTTTTATTT GAATTTACGT 6600

CTTTTGCTAT GTAAATTAGT TTTATAAACT AACAAAGTTA GGATATTGAC AATAGGAGGA 6660

GAAGTTTTTA TGGTTGCTAA AATTTTAGAT GGTAAACAAA TTGCCAAAGA CTACAGACAG 6720

GGGTTACAAG ATCAAGTTGA AGCGCTAAAA GAAAAGGGTT TTACACCTAA ATTATCCGTT 6780

ATATTAGTTG GTAATGATGG CGCTAGTCAA AGTTATGTTA GATCAAAAAA GAAAGCAGCT 6840

GAAAAAATTG GTATGATTTc AGAAATCGTA CATTTGGAAG AAACAGCTAC TGAAQAAGAA 6900

GTATTAAACG AACTAAATAG ACTAAATAAT GATGATTCTG TAAGTGGTAT TTTGGTACAA 6960

GTACCATTAC CAAAACAAGT TAGCGAACAG AAAATATTAG AAGCAATCAA TCCTGAAAAA 7020

GATGTGGACG GTTTTCATCC AATAAATATA GGGAAATTAT ATATCGATGA ACAAACTTTT 7080

GTACCTTGCA CACCGCTCGG CATCATGGAA ATATTAAAAC ATGCTGATAT TGATTTAGAA 7140

GGTAAAAATG CAGTTGTAAT TGGACGAAGT CATATTGTCG GACAACCAGT TTCTAAGTTA 7200

CTACTTCAAA AAAATGCATC AGTAACAATC TTACATTCTC GTTCAAAAGA TATGGCATCA 7260

TATTTAAAAG ATGCTGATGT CATTGTCAGT GCAGTTGGTA AGCCTGGTTT AGTAACAAAA 7320

GATGTGGTCA AAGAAGGAGC AGTAATTATC GATGTTGGCA ATACGCCAGA TGAAAATGGC 7380

AAATTAAAAG GTGACGTTGA TTATGATGCG GTTAAAGAAA TTGCTGGAGC TATTACACCA 7440

GTTCCTGGTG GCGTTGGTCC ATTAACAATT ACTATGGTAT TAAATAATAC TTTGCTTGCA 7500

GAAAAAATGC GTCGAGGTAT TGATTCGTAA AGAGCCTGAG ACATAAATCA ATGTTCTATG 7560

CTCTACAAAG TTATAATGGC AGTAGTTGAC TGAACGAAAA TTCGCTTGTA ACAAGCTTTT 7620

TTCAATTCTA GTCAACCTTG CCGGGGTGGG ACGACGAAAT AAATTTTACG AAAATATCAT 7680

TTCTGTCCCA CTCCCTAATA ACTGAGTTTT AATGAAGTCT TTTAACCCAC ATTAAATATT 7740

ATTTTGCAAT TGCAATGAAT AACAAGAAAA ATCTGGGACA TTAATCGATC AAATGCTCCC 7800

TTCAAAGTAG ACATTGAATA AATGAAGGCT TTGAAGGGAG CATTTCACTT TGTACTTGGC 7860

TCAACAATTT TATATAGACA GTAGTTAATT GAATGAAAAT AAGCTTGTAA CAAGTTTTCA 7920
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EP0 786 519 A2 

GTTGGGGATG GGCCCCAACA CAGAAGCTGT GACTATGATA AAGTACTACT ACATAGTTAA 8 04 0 

TCATTAGTGG TTCTTTATCA TTTTCGCCTC CCTTTTCTTA TTGTTTTGAT ACACAAAAAT 810 0 

TTAAGTTCAA ACTGTCGAAT AAAGTTATAT TTGATTTCAA ATTATCCCTA AATTATTAAT 8160 

TkTACAATTG TGGCAGATTT TCAAAATAAT AATTATTTCC TCATTATTTA TAAATTTATA 8 22 0 

TTTAAATTTC ATTCTTTATA GGGTAAGATT AGGACTATAG TATGATGTGT ArATAATATA 8280 

AATTAAGGTA TAGTAAAGCT AACTCAGAAA TGACTTATCA TTCGGAGGTT ACATTATGAA 834 0 

TAAACTATTA CAGTCATTAT CAGCCCTCGG TGTTTCTGCT ACACTAGTAA CACCAAATTT 8400 

AAATGCAGAT GCAACGACGA ATACTACACC ACAAATTAAA GGCGCTAATG ATATCGTTAT 8460

TAAGAAAC3GT CAAGATTATA ACCTTCTAAA CGGCATAAGT GCATTTGATA AAGAAGATGG 8520 

AGATTTAACC GATAAAATTA AAGTCGATGG CCAAATTGAT ACATCTAAAT CTGGTAAATA 85 80 

TCAAATTAAA TATCATGTCA CTGATTCAGA TGGTGCAATT AAAATTTCCA CTAGGTATAT 8640

TGAGGTTAAA TAGCCCTCAT CACTATACTG CAAATAAAAT GGTAGCAAAC GAACATGTTT 8 700 

TGCTACCATT TTATTTGTTA TTCTAACTTC ATCTGCAACT TTAACCCAAA TATTGTATTT 8760 

TTTCTGTATA CCAAAGGACT ACCTATCAAA TTATTAAAAC TTAACTGCTC TTTTTAAAAA 8820

AATGTTTTGA TTTTGAACAA ACAAATTTCC ACTTTTCATT GTTTAACGAT AAATTACTTT 8 8 80 

TGGCAAATTC CTTATTAAAA TGTTTGCGCT TCCTTTCAAT CAACTAGCCA TCATTTTCAA 894 0 

TTTATTAGAC AATTTCAAAC TTTTTTTATT TTCATTCAAT TAACCTTTAA TTGAAAGCTA 9000 

TTCTCAACTT TCCTTTTAAA TATGAAGCAA TTTTTTCAAA AACGCTATTA GTCACAAAAT 9060 

GT 9062 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2738 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

AAATATTTTT TCAAAACTAT GTGAAAATGG aCCATGTCtA aATCATGTAA TAATGCAGyA 6 0 

CATAATGCCA ACGGTCTmTC TTTATTGTCC CATGCATCAT GACCAATAAA TGACTCATCA 120 

ATTAATCGTC TAACTATTTC ATACACACCT AAAGAATGTC CAAAGCGACT ATGTTCTGCT 180

GTGTGAAAAG ATAGGTACAG TGTTCCTAGT TGTCTAATTC GACGTAACCT TTGGAATTCC 240 
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